RDU | Next-Gen AI Chip for Inference at Scale

From chips to racks

SambaNova RDUs combine into a single platform that can run the largest models. The fifth-generation SN50 RDU scales up to 256 chips across multiple racks and runs models of up to 10 trillion parameters with context lengths of up to 10 million tokens.
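
As a rough sense of scale, the figures above can be turned into a per-chip estimate. The sketch below is a back-of-the-envelope calculation, not a description of how the SN50 actually partitions a model: it assumes weights are split evenly across all 256 chips, and the bytes-per-parameter values are illustrative assumptions, since no numeric precision is stated here. The function names are hypothetical.

```python
def params_per_chip(total_params: int, chips: int) -> float:
    """Parameters held by each chip under an assumed even split."""
    if chips <= 0:
        raise ValueError("chips must be positive")
    return total_params / chips


def weight_gb_per_chip(total_params: int, chips: int, bytes_per_param: float) -> float:
    """Weight storage per chip in gigabytes (1 GB = 1e9 bytes)."""
    return params_per_chip(total_params, chips) * bytes_per_param / 1e9


TOTAL_PARAMS = 10 * 10**12  # 10 trillion parameters (figure from the text)
CHIPS = 256                 # maximum scale-up (figure from the text)

print(f"{params_per_chip(TOTAL_PARAMS, CHIPS):,.0f} parameters per chip")

# Assumed precisions, for illustration only.
for bytes_per_param in (1, 2):
    gb = weight_gb_per_chip(TOTAL_PARAMS, CHIPS, bytes_per_param)
    print(f"{bytes_per_param} byte(s) per parameter: {gb:.3f} GB of weights per chip")
```

Under these assumptions each chip would hold roughly 39 billion parameters. Real deployments also need memory for activations and the key-value cache, which this estimate ignores.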

With the RDU at the heart of SambaRack, these systems integrate seamlessly into existing air-cooled data centers.

News

SambaNova Launches First Turnkey AI Inference Solution for Data Centers, Deployable in 90 Days

July 7, 2025

Blog

Why SambaNova's SN40L Chip Is the Best for Inference

September 10, 2024

News

SambaNova Launches its AI Platform in AWS Marketplace

May 29, 2025

Speed

RDUs are the only solution that runs the largest AI models on a single system with blazing-fast performance.

Energy

RDUs deliver the highest number of tokens per kilowatt-hour, making them well suited to data centers of all sizes.
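
Tokens per kilowatt-hour can be computed from a sustained throughput and an average power draw. The sketch below shows the unit conversion only; the function name and the sample figures are hypothetical and are not SambaNova measurements.

```python
SECONDS_PER_HOUR = 3600


def tokens_per_kwh(tokens_per_second: float, power_kw: float) -> float:
    """Tokens generated per kilowatt-hour of energy consumed.

    One hour at `power_kw` kilowatts consumes `power_kw` kWh, and the
    system produces `tokens_per_second * 3600` tokens in that hour.
    """
    if power_kw <= 0:
        raise ValueError("power_kw must be positive")
    return tokens_per_second * SECONDS_PER_HOUR / power_kw


# Hypothetical figures: 1,000 tokens/s sustained at an average draw of 10 kW.
print(f"{tokens_per_kwh(1_000, 10):,.0f} tokens per kWh")
```

Comparing systems on this metric only makes sense when the model, batch size, and sequence lengths are held constant.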

Agentic

A three-tier memory architecture keeps multiple models loaded and switches between them quickly, which makes it a good fit for AI agents.

FAQs

The SN50 RDU (Reconfigurable Dataflow Unit) is SambaNova’s fifth-generation AI inference processor, designed specifically for large-scale, agentic workloads. It uses SambaNova’s Dataflow technology and a three-tier memory architecture to reduce data movement, enabling faster inference, lower latency, and better energy efficiency than traditional accelerator designs.

GPUs are general-purpose accelerators designed to handle a wide range of compute workloads, primarily training. The SambaNova RDU is purpose-built for inference. It uses a Dataflow architecture and a three-tier memory architecture to map model execution directly onto the processor, minimizing data movement to and from memory, which is the most expensive part of AI inference.

The SN50 is the latest generation of SambaNova’s RDU, offering higher compute performance, increased network bandwidth, and improved scalability compared to the SN40. While the SN40 is well-suited for existing inference deployments and power-constrained environments, the SN50 is designed for large-scale, agentic AI workloads. It enables faster token generation, better system throughput, and more efficient multi-model execution.

The SN50 supports a wide range of inference-heavy AI workloads that require low latency, high throughput, and efficient memory usage. These include AI agents, coding assistants, enterprise copilots, conversational AI, retrieval-augmented generation (RAG), and model hosting platforms. It is particularly well-suited to agentic workflows involving multi-step reasoning, tool usage, and frequent model switching.

The SN50 is designed to run multiple models simultaneously using its tiered memory architecture. Models remain resident in memory and can be switched with minimal latency. This capability is especially important for agentic workloads that rely on multiple models across task steps, because it improves responsiveness, utilization, and overall inference efficiency.

The SN50 scales to large models through a combination of memory architecture and distributed deployment. Multiple racks can be interconnected to form inference clusters, enabling support for larger models, higher concurrency, and predictable performance at scale.
