OpenRouter uses SambaCloud to deliver high-speed LLM performance
Developer-focused company meets the performance demands of real-time use cases

Challenge:
OpenRouter is the largest marketplace for LLM inference, offering hundreds of LLMs across dozens of cloud providers. When developers and enterprises need inference for a product or use case, they build a single integration to OpenRouter and gain access to every model they may need through one interface and one billing relationship.
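As a sketch of what that single integration looks like in practice, the Python example below assumes OpenRouter's OpenAI-compatible chat completions endpoint; the model IDs and the API key placeholder are illustrative and not drawn from this case study.

```python
# A minimal sketch of the "single integration" pattern, assuming OpenRouter's
# OpenAI-compatible chat completions endpoint. Model IDs and the API key
# placeholder are illustrative.
import requests

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
API_KEY = "YOUR_OPENROUTER_API_KEY"  # placeholder

def chat(model: str, prompt: str) -> str:
    """Send one prompt to any model on OpenRouter through the same endpoint."""
    response = requests.post(
        OPENROUTER_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": model, "messages": [{"role": "user", "content": prompt}]},
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

# The same integration reaches models from many different providers:
print(chat("deepseek/deepseek-r1", "Summarize dataflow architectures in one line."))
print(chat("meta-llama/llama-4-maverick", "Summarize dataflow architectures in one line."))
```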
Some developers require consistent, high-speed inference for their use cases. For these workloads, total generation time is dominated by overall throughput, not just time to first token.
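A rough back-of-the-envelope calculation shows why: for long responses, decode throughput, not time to first token, dominates total latency. The numbers below are illustrative assumptions, not measured benchmarks.

```python
# Illustrative arithmetic only; these figures are assumptions, not benchmarks.
def total_generation_time(ttft_s: float, tokens: int, tokens_per_s: float) -> float:
    """Total latency = time to first token + decode time for the generated tokens."""
    return ttft_s + tokens / tokens_per_s

# For a 2,000-token response, throughput dwarfs time to first token:
print(total_generation_time(ttft_s=0.5, tokens=2000, tokens_per_s=50))   # 40.5 s
print(total_generation_time(ttft_s=0.5, tokens=2000, tokens_per_s=400))  # 5.5 s
```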
As a developer-first platform, OpenRouter sees a wide variety of use cases that need the throughput SambaNova offers. This high performance is especially critical for interactive use cases, such as chat, where users expect an instant response.
Solution:
OpenRouter offers a variety of open source models via SambaCloud. As a high-speed inference provider, SambaNova hits the sweet spot of price and performance for a number of OpenRouter customers who require reliable inference and high throughput.
Powered by the SambaNova RDU, the SN40L, SambaCloud delivers the fastest inference on the largest and best open source models, including the fastest performance available for DeepSeek R1 671B, Meta Llama 4 Maverick, and more.
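For customers who specifically want SambaNova's throughput, OpenRouter also exposes provider routing preferences on the same endpoint. The sketch below assumes OpenRouter's `provider` routing field and the "SambaNova" provider name; treat the exact field layout and slugs as assumptions rather than verbatim configuration from this case study.

```python
# A hedged sketch of steering requests toward SambaNova via OpenRouter's
# provider routing options; field names and provider slug are assumptions
# based on OpenRouter's routing documentation.
import requests

payload = {
    "model": "deepseek/deepseek-r1",
    "messages": [{"role": "user", "content": "Hello"}],
    # Prefer SambaNova for throughput; disallow falling back to other providers.
    "provider": {"order": ["SambaNova"], "allow_fallbacks": False},
}
response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_OPENROUTER_API_KEY"},  # placeholder
    json=payload,
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```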
[Stat cards: LLMs available across dozens of providers · average number of models companies use · providers on the platform]
“SambaNova, as a high throughput provider, has an important role in the market for use cases where they care about the total time to last token for larger prompts.”
— Chris Clark, OpenRouter Chief Operating Officer
In the video above, Chris Clark, Chief Operating Officer at OpenRouter, discusses how OpenRouter leverages SambaNova's performance to meet customer demands.