DeepSeek-V3.1 Is Live on SambaCloud!

by SambaNova
August 25, 2025

Vibe code with lightning-fast DeepSeek-V3.1 on SambaCloud!

Just last week, DeepSeek dropped their latest model: DeepSeek-V3.1. Launching today, DeepSeek-V3.1 is now available for developers on SambaCloud — running over 200 tokens/second!

DeepSeek-V3.1 shares the same architecture as the original DeepSeek-R1, which made waves earlier this year and has been available on SambaCloud since February. This update introduces a hybrid thinking mode, which lets developers switch between reasoning and non-reasoning modes on a per-request basis (see the code snippet below). According to Artificial Analysis, both modes show significant gains in intelligence, with especially large improvements in coding capabilities!

[Figure: Artificial Analysis DeepSeek-V3.1 benchmark comparison]

 

Why DeepSeek-V3.1?

DeepSeek-V3.1 shows significant improvements across a range of benchmarks compared to prior iterations of the R1 and V3 models for thinking and non-thinking modes, respectively. On the Aider polyglot benchmark, which measures a model's ability to write and edit code, V3.1 in thinking mode scored 76.3 versus 72 for Claude Opus 4. In short, DeepSeek-V3.1 delivers better performance at a fraction of the price.

Benchmark                              DeepSeek-V3.1    DeepSeek-V3    DeepSeek-V3.1    DeepSeek-R1
                                       (Non-Thinking)   0324           (Thinking)       0528
LiveCodeBench (2408-2505)              56.4             43.0           74.8             73.3
Aider-Polyglot (Acc.)                  68.4             55.1           76.3             71.6
SWE Verified (Agent mode)              66.0             45.4           *                44.6
SWE-bench Multilingual (Agent mode)    54.5             29.3           *                30.5
Terminal-bench (Terminus 1 framework)  31.3             13.3           *                5.7

* Of DeepSeek-V3.1's two modes, only Non-Thinking supports function calling, which is why there are no Thinking-mode scores for these agent-mode benchmarks. Even so, DeepSeek-V3.1 Non-Thinking outscores DeepSeek-R1-0528 on all three.

These benchmarks highlight the model's major gains in coding and suggest it is well suited for coding agents such as Blackbox, or for integration into coding assistants like Cline. Moreover, DeepSeek has improved the model's function-calling capabilities in non-thinking mode, making it an even better fit for agentic frameworks like CrewAI.
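As a rough sketch of what tool use looks like in practice, the request below defines a function in the standard OpenAI-compatible tools format and leaves thinking mode off, since function calling is supported in non-thinking mode. The `get_weather` tool name and schema are purely illustrative, not taken from SambaNova's documentation.

```python
def weather_tool_schema() -> dict:
    """An OpenAI-style tool definition the model may choose to call.
    The name and parameters here are hypothetical examples."""
    return {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }

# Request body for POST /v1/chat/completions with tools attached.
request_body = {
    "model": "DeepSeek-V3.1",
    # Function calling is supported in non-thinking mode, so keep
    # enable_thinking off for tool-use requests.
    "chat_template_kwargs": {"enable_thinking": False},
    "messages": [{"role": "user", "content": "What's the weather in Tokyo?"}],
    "tools": [weather_tool_schema()],
}
```

When the model decides to call the tool, the response contains a `tool_calls` entry with the function name and JSON arguments, which the application executes and feeds back as a `tool` message.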

Best of all, because DeepSeek-V3.1 is open source, developers and enterprises have the flexibility to deploy it privately and securely on the hardware and in the locations that best suit their applications. SambaCloud servers are based in the U.S. and Japan, so no data is sent to China when using the model on our cloud. With SambaNova, developers can start building on our cloud today and scale to meet their customers' requirements with hardware deployed on premises or even at the edge. Developers also save by running open-source models at a fraction of the price of closed-source alternatives.

Model                          Input ($/M tokens)    Output ($/M tokens)
DeepSeek-V3.1 on SambaCloud    $3*                   $4.50*
Claude Opus 4.1                $15                   $75
Grok 4                         $3                    $15
OpenAI o3                      $3                    $15

* Custom pricing options are available for Enterprise tier offerings.

Start building with relentless intelligence in minutes on SambaCloud

SambaCloud is a powerful platform that enables developers to easily integrate the best open-source models with the fastest inference speeds. Powered by our state-of-the-art AI chip, the SN40L, SambaCloud provides a seamless and efficient way to build AI applications with fast inference. Get started today and experience the benefits of fast inference speeds, maximum accuracy, and an enhanced developer experience, in just three easy steps!


 

  1. Head over to SambaCloud and create your own account.
  2. Get an API key.
  3. Make your first API call with our OpenAI-compatible API.

curl -X POST https://api.sambanova.ai/v1/chat/completions \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "stream": true,
    "chat_template_kwargs": {"enable_thinking": true},
    "model": "DeepSeek-V3.1",
    "messages": [
      {"role": "system", "content": "You are a helpful assistant"},
      {"role": "user", "content": "Hello"}
    ]
  }'
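Because the endpoint is OpenAI-compatible, the same call can be made from Python with the `openai` SDK, passing `chat_template_kwargs` through `extra_body` to toggle thinking mode. This is a minimal sketch: the `SAMBANOVA_API_KEY` environment variable name is an assumption, and the network call is guarded so the snippet runs offline.

```python
import os

def build_request(user_prompt: str, enable_thinking: bool) -> dict:
    """Assemble the JSON body for POST /v1/chat/completions."""
    return {
        "model": "DeepSeek-V3.1",
        "stream": False,
        "chat_template_kwargs": {"enable_thinking": enable_thinking},
        "messages": [
            {"role": "system", "content": "You are a helpful assistant"},
            {"role": "user", "content": user_prompt},
        ],
    }

# Only reach out to the API when a key is configured (env var name assumed).
api_key = os.environ.get("SAMBANOVA_API_KEY")
if api_key:
    from openai import OpenAI  # reuse the OpenAI SDK against the compatible endpoint
    client = OpenAI(base_url="https://api.sambanova.ai/v1", api_key=api_key)
    body = build_request("Hello", enable_thinking=True)
    resp = client.chat.completions.create(
        model=body["model"],
        messages=body["messages"],
        # extra_body passes the non-standard field through to the endpoint
        extra_body={"chat_template_kwargs": body["chat_template_kwargs"]},
    )
    print(resp.choices[0].message.content)
```

Setting `enable_thinking` to `False` in the same request body switches the model to non-thinking mode, which is the mode to use for function calling.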

 

To learn more about the model and how to use it with various frameworks, see our documentation portal, which includes guides on function calling and integrations with coding assistants like Cline.