Pushing the AI frontier with premium inference

Run the largest models by maximizing dataflow efficiency with
high speed and sustained throughput.

Connect with Experts

The first disaggregated inference demo for AI agents is live

At COMPUTEX, SambaNova demonstrated what the next era of AI inference looks like: Premium inference for AI agents powered by GPUs and RDUs, running live in the newly-announced VC2 data center for the first time.

home-rdu-bg

Introducing the SN50 RDU - our fifth-generation AI chip!

Only chip to deliver the speed and throughput required for agentic AI. Purpose-built for agentic inference delivering the best tokens per watt.

Inference stack by design

How RDU Dataflow Architecture works

Why Modern Al Infrastructure Demands Model Bundling

Not One-Model-Per-Node Thinking

Learn more

The Goldilocks Zone for agents

The SN50 delivers 3X the savings compared to competitive chips for agentic inference. Co-Founder and Chief Technologist Kunle Olukotun explains how SN50 tiered memory allows agents to have access to a cache for models and prompts, further improving efficiency.

Build with the best open-source models

Sovereign AI around the world

Meet our network of sovereign AI data center partners. Powered by SambaNova, each delivers top-tier performance and the flexibility of open source within their national borders.
AUSTRALIA
southerncrossai-v2
EUROPE
infercom-v2 ovhcloud
UNITED KINGDOM
argyll-v2
SambaStack

The only chips-to-model computing built for AI

OpenAI Compatible APIs SambaOrchestrator Reconfigurable Dataflow Unit (RDUs) SambaRack

Inference | Bring Your Own Checkpoints

SambaNova provides simple-to-integrate APIs for Al inference, making it easy to onboard applications. Our APIs are OpenAI compatible allowing you to port your application to
SambaNova in minutes.

 

Auto Scaling | Load Balancing | Monitoring | Model Management | Cloud Create | Server Management

SambaOrchestrator simplifies managing AI workloads across data centers. Easily monitor and manage model deployments and scale automatically to meet user demand.

 

SambaRack™ is a state-of-the-art system that can be set up easily in data centers to run Al inference workloads. SambaRack SN40-16 is our fourth generation system optimized for low power inference (average of 10 kWh) and running many models simultaneously.


SambaRack SN50 is our fifth-generation system optimized for fast agentic inference at a fraction of the cost running the largest models, like gpt-oss-120b and DeepSeek.

 

SN40 | SN50 RDU

At the heart of SambaNova's innovation lies the RDU. With a unique three-tier memory architecture and dataflow processing, RDU chips are able to achieve much faster inference using a lot less power than other architectures.

 
  • Complete AI platform that provides a fully integrated end-to-end agentic AI stack – spanning across agents, models, knowledge, and data.

  • Composable AI platform that is open, unifies structured and unstructured data, queries in any environment, and deploys on any AI model. Build or use pre-built AI agents — all with business-aware intelligence.

  • Sovereign AI platform that keeps data secure and governed while business teams query in any environment. IT stays in control, while business teams self-serve AI — and both can focus on what matters.

The First Disaggregated Inference Demo for AI Agents Is Live

The First Disaggregated Inference Demo for AI Agents Is Live

June 3, 2026
Building the Blueprint for Premium Inference

Building the Blueprint for Premium Inference

April 8, 2026
Introducing the SN50 RDU: Purpose-Built for Agentic Inference

Introducing the SN50 RDU: Purpose-Built for Agentic Inference

February 24, 2026