Just last week, DeepSeek dropped their latest model: DeepSeek-V3.1. Launching today, DeepSeek-V3.1 is now available to developers on SambaCloud, running at over 200 tokens per second!
DeepSeek-V3.1 shares the same architecture as the original DeepSeek R1, which made waves earlier this year and has been available on SambaCloud since February. This update introduces a hybrid thinking mode, which lets developers switch between reasoning and non-reasoning modes (see code snippet below). According to Artificial Analysis, both modes show significant improvements in intelligence, with especially large gains in coding capabilities!
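As a minimal sketch of the mode switch, the snippet below builds chat-completion payloads for an OpenAI-compatible endpoint. The base URL, the `DeepSeek-V3.1` model identifier, and the `chat_template_kwargs` reasoning toggle are assumptions modeled on common serving conventions; check the SambaNova documentation for the exact parameters your account supports.

```python
# Sketch: toggling DeepSeek-V3.1's hybrid thinking mode through an
# OpenAI-compatible chat endpoint. Endpoint URL, model name, and the
# "thinking" toggle are assumptions -- verify against the SambaNova docs.
import json
import os
import urllib.request

API_URL = "https://api.sambanova.ai/v1/chat/completions"  # assumed endpoint


def build_request(prompt: str, thinking: bool) -> dict:
    """Build a chat-completion payload with reasoning switched on or off."""
    return {
        "model": "DeepSeek-V3.1",  # assumed model identifier on SambaCloud
        "messages": [{"role": "user", "content": prompt}],
        # Hypothetical toggle: some servers expose hybrid reasoning as a
        # request-level chat-template flag rather than separate model names.
        "chat_template_kwargs": {"thinking": thinking},
    }


def send(payload: dict) -> dict:
    """POST the payload with a bearer token and return the parsed response."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {os.environ['SAMBANOVA_API_KEY']}",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Reasoning mode for a hard problem, fast mode for a simple one.
    slow = build_request("Prove that sqrt(2) is irrational.", thinking=True)
    fast = build_request("Say hello in French.", thinking=False)
    print(slow["chat_template_kwargs"], fast["chat_template_kwargs"])
```

In practice you would call `send(build_request(...))` with your API key set; keeping payload construction separate from the network call makes the toggle easy to test.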
DeepSeek-V3.1 shows significant improvements across benchmarks compared to its predecessors: the thinking mode over DeepSeek R1, and the non-thinking mode over DeepSeek-V3. On the Aider polyglot benchmark, which measures a model's ability to write and edit code, the V3.1 thinking model scored 76.3 versus 72 for Claude Opus 4. In short, DeepSeek-V3.1 delivers much better performance at a fraction of the price.
| Benchmark Metrics | DeepSeek-V3.1 (Non-Thinking) | DeepSeek-V3-0324 | DeepSeek-V3.1 (Thinking) | DeepSeek-R1-0528 |
| --- | --- | --- | --- | --- |
| LiveCodeBench (2408–2505) | 56.4 | 43.0 | 74.8 | 73.3 |
| Aider-Polyglot (Acc.) | 68.4 | 55.1 | 76.3 | 71.6 |
| SWE Verified (Agent mode) | 66.0 | 45.4 | * | 44.6 |
| SWE-bench Multilingual (Agent mode) | 54.5 | 29.3 | * | 30.5 |
| Terminal-bench (Terminus 1 framework) | 31.3 | 13.3 | * | 5.7 |
* Only DeepSeek-V3.1 Non-Thinking supports function calling, so no scores are reported for the Thinking mode on these agentic benchmarks. Even so, DeepSeek-V3.1 Non-Thinking outperforms DeepSeek-R1-0528 on all three.
These benchmarks highlight the model's significant improvements in coding and suggest that it is well suited for coding agents such as Blackbox, or for integration into coding assistants like Cline. Moreover, DeepSeek has improved the model's function calling capabilities in non-thinking mode, which makes it even better for use with agentic frameworks like CrewAI.
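To illustrate how function calling plugs into such frameworks, here is a minimal sketch of a tool-use request in non-thinking mode. The tool schema follows the widely used OpenAI-style `tools` format; the `DeepSeek-V3.1` model name and the `get_weather` helper are illustrative assumptions, not part of any official example.

```python
# Sketch: an OpenAI-style function-calling payload for DeepSeek-V3.1 in
# non-thinking mode. The get_weather tool and model name are hypothetical.
import json


def weather_tool_spec() -> dict:
    """OpenAI-style JSON schema for a hypothetical get_weather tool."""
    return {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }


def build_tool_call_request(user_message: str) -> dict:
    """Build a chat-completion payload that offers the model one tool."""
    return {
        "model": "DeepSeek-V3.1",  # assumed identifier on SambaCloud
        "messages": [{"role": "user", "content": user_message}],
        "tools": [weather_tool_spec()],
        "tool_choice": "auto",  # let the model decide whether to call it
    }


payload = build_tool_call_request("What's the weather in Tokyo?")
print(json.dumps(payload, indent=2))
```

When the model decides to use the tool, the response contains a `tool_calls` entry with the arguments it chose; your agent executes the function and sends the result back as a `tool` role message.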
Best of all, because the model is open source, developers and enterprises have the flexibility to deploy it privately and securely on the hardware and in the locations that best suit their applications. SambaCloud servers are based in the U.S. and Japan, so no data is sent to China when using the model in our cloud. With SambaNova, developers can start building on our cloud today and scale to meet their customers' requirements with hardware deployed on-premises or even at the edge. Moreover, developers save more by running open-source models at a fraction of the price of closed-source alternatives.
| Model | Input Price per Million Tokens | Output Price per Million Tokens |
| --- | --- | --- |
| DeepSeek-V3.1 on SambaCloud | $3* | $4.50* |
| Claude Opus 4.1 | $15 | $75 |
| Grok 4 | $3 | $15 |
| OpenAI o3 | $3 | $15 |
* Custom pricing options are available for Enterprise tier offerings.
To learn how to use the model and integrate it into various frameworks, developers can visit our documentation portal, which includes guides on function calling and on integrations with coding assistants like Cline.