SambaNova DataScale

The Fastest, Most Efficient Hardware Platform for Generative and Agentic AI

SambaNova DataScale is the only fully integrated hardware-software system that enables organizations to train, fine-tune, and deploy the most demanding AI workloads. Achieve world-record inference performance with the largest and most challenging models, all with dramatically reduced power requirements compared to other systems.

Powered by the SambaNova SN40L Reconfigurable Dataflow Unit (RDU), the SambaNova DataScale SN40L delivers unprecedented performance across all model sizes, enabling government agencies, research organizations, and enterprises to quickly deploy more and larger models, with greater accuracy, all in a smaller footprint than GPUs.

The fastest inference platform

Delivering world-record performance across the latest large and small models, with the highest accuracy.

Reduced power consumption

Run dozens of models and switch between them in microseconds, on a single rack that consumes only 11 kW.

Designed for scale

Start with as little as one node and a few models and scale efficiently.


The DataScale system takes advantage of the unique SambaNova SN40L Reconfigurable Dataflow Unit (RDU) to deliver exceptional performance in a small footprint. The SN40L achieves this thanks to its revolutionary dataflow architecture and large memory capacity.

Dataflow architecture

The SN40L is purpose-built for AI. Breaking free from the limitations of legacy technologies, the SN40L uses a dataflow architecture and an innovative software stack that maps AI algorithms to the processor and dynamically reconfigures it for optimal performance. This eliminates the redundancy inherent in GPU architectures, resulting in significantly greater performance on a fraction of the hardware footprint.

Three-tiered memory architecture

Purpose-built to power the largest AI models, the SN40L has a three-tiered memory architecture that includes very large memory, high-bandwidth memory, and very fast memory. As a result, a single system node can support up to 5 trillion parameters across hundreds of separate models. With terabytes of addressable memory, the SN40L is ideal for custom and chained models, and can switch between models in microseconds, orders of magnitude faster than legacy GPUs.
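
To see why terabytes of addressable memory matter at this scale, a quick back-of-the-envelope calculation (illustrative only, using generic 16-bit weights; not vendor specifications) shows the raw weight storage that trillion-parameter models require:

```python
# Rough memory math for large model weights (illustrative estimate,
# assuming 2 bytes per parameter, i.e. bf16/fp16 precision).
def model_bytes(params: float, bytes_per_param: int = 2) -> float:
    """Raw storage needed for a model's weights alone."""
    return params * bytes_per_param

TB = 1024 ** 4  # bytes per tebibyte

# A 405B-parameter model and a 5-trillion-parameter aggregate:
print(f"405B params @ 2 bytes: {model_bytes(405e9) / TB:.2f} TB")
print(f"5T params @ 2 bytes:   {model_bytes(5e12) / TB:.2f} TB")
```

Weights alone for a 5-trillion-parameter aggregate run to roughly 9 TB at 16-bit precision, which is why keeping many models resident, and switching between them quickly, demands a memory hierarchy far larger than on-chip or HBM capacity alone.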

The industry’s most advanced software stack

SambaNova DataScale features a complete software stack designed to take input from standard machine learning frameworks like PyTorch. Compile, optimize, and execute your models without the need for low-level tuning.
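
The page describes the stack as ingesting standard framework models, so the starting point is ordinary PyTorch code with nothing vendor-specific in it. A minimal sketch (the model below is a hypothetical toy, and the compile/deploy step itself is not shown):

```python
import torch
import torch.nn as nn

# A standard PyTorch model definition -- no vendor-specific code.
# (Hypothetical example: per the page, a stack like DataScale's takes
# models in this form as input; the downstream compilation step is
# handled by the stack rather than by hand-tuned kernels.)
class TinyClassifier(nn.Module):
    def __init__(self, in_features: int = 16, classes: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, 32),
            nn.ReLU(),
            nn.Linear(32, classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

model = TinyClassifier()
logits = model(torch.randn(8, 16))  # batch of 8 inputs
print(logits.shape)  # torch.Size([8, 4])
```

The point of the workflow is that the model author stays at this level of abstraction; mapping the computation onto the hardware is the stack's job.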

The fastest system for inference

The DataScale SN40L delivers the performance, flexibility, scale, and efficiency to power the inference workloads of today and tomorrow. It delivers world-record performance and accuracy across Llama 3 8B, 70B, and 405B models. It is the only system that delivers high performance for the 405B model: other systems either cannot power models that large or, in the case of GPUs, run them while the SN40L is 5x faster.

The most efficient platform

The SambaNova DataScale SN40L is the only platform that can power dozens of models on a single system at the same time, with unrivaled performance and accuracy. The SN40L delivers higher inference speed, with a smaller footprint, and only a fraction of the power consumption compared to other systems.

The most flexible system for fine tuning

Proven in some of the most demanding customer environments in the world, the DataScale SN40L provides outstanding performance for model training while eliminating the need for low-level model tuning. Users can bring their own custom models or Llama 3 checkpoints and securely fine-tune them with their private data.
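
In framework terms, the fine-tuning workflow described above looks like a standard PyTorch training loop over a loaded checkpoint. A minimal sketch, where the model and data are small stand-ins rather than Llama 3, and the checkpoint path is hypothetical:

```python
import torch
import torch.nn as nn

# Hedged sketch of fine-tuning a pre-trained model on private data.
# The tiny linear model and random tensors below are stand-ins; in
# practice you would load real weights first, e.g.:
#   model.load_state_dict(torch.load("checkpoint.pt"))  # hypothetical path
model = nn.Linear(8, 2)
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

# Stand-in "private" dataset: 32 samples, 8 features, 2 classes.
x = torch.randn(32, 8)
y = torch.randint(0, 2, (32,))

for step in range(10):
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
```

The same loop shape applies whether the checkpoint is a small custom model or a large foundation model; what changes is where the compilation and execution happen.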