OpenRouter uses SambaCloud to deliver high-speed LLM performance
Developer-focused company meets the performance demands of real-time use cases

Challenge:
OpenRouter is the largest marketplace for LLM inference, offering hundreds of LLMs across dozens of cloud providers. When developers and enterprises need inference for a product or use case, they build a single integration to OpenRouter and gain access to every model they may need through one interface and one billing relationship.
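As a sketch of what that single integration looks like in practice, the Python example below assumes OpenRouter's OpenAI-compatible chat completions endpoint; the model IDs and the API key placeholder are illustrative and not drawn from this case study.

```python
# A minimal sketch of the "single integration" pattern, assuming OpenRouter's
# OpenAI-compatible chat completions endpoint. Model IDs and the API key
# placeholder are illustrative.
import requests

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
API_KEY = "YOUR_OPENROUTER_API_KEY"  # placeholder

def chat(model: str, prompt: str) -> str:
    """Send one prompt to any model on OpenRouter through the same endpoint."""
    response = requests.post(
        OPENROUTER_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"model": model, "messages": [{"role": "user", "content": prompt}]},
        timeout=60,
    )
    response.raise_for_status()
    return response.json()["choices"][0]["message"]["content"]

# The same integration reaches models from many different providers:
print(chat("deepseek/deepseek-r1", "Summarize dataflow architectures in one line."))
print(chat("meta-llama/llama-4-maverick", "Summarize dataflow architectures in one line."))
```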
Some developers require consistent, high-speed inference for their use cases. For these workloads, total generation time is dominated by overall throughput, not just time to first token.
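A rough back-of-the-envelope calculation shows why: for long responses, decode throughput, not time to first token, dominates total latency. The numbers below are illustrative assumptions, not measured benchmarks.

```python
# Illustrative arithmetic only; these figures are assumptions, not benchmarks.
def total_generation_time(ttft_s: float, tokens: int, tokens_per_s: float) -> float:
    """Total latency = time to first token + decode time for the generated tokens."""
    return ttft_s + tokens / tokens_per_s

# For a 2,000-token response, throughput dwarfs time to first token:
print(total_generation_time(ttft_s=0.5, tokens=2000, tokens_per_s=50))   # 40.5 s
print(total_generation_time(ttft_s=0.5, tokens=2000, tokens_per_s=400))  # 5.5 s
```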
As a developer-first platform, OpenRouter sees a wide variety of use cases that need the throughput SambaNova offers. This high performance is especially critical for interactive use cases, such as chat, where users expect an instant response.
Solution:
OpenRouter offers a variety of open source models via SambaCloud. As a high-speed inference provider, SambaNova hits the sweet spot of price and performance for a number of OpenRouter customers who require reliable inference and high throughput.
Powered by the SambaNova RDU, the SN40L, SambaCloud delivers the fastest inference on the largest and best open source models, including the fastest performance available for DeepSeek R1 671B, Meta Llama 4 Maverick, and more.
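For customers who specifically want SambaNova's throughput, OpenRouter also exposes provider routing preferences on the same endpoint. The sketch below assumes OpenRouter's `provider` routing field and the "SambaNova" provider name; treat the exact field layout and slugs as assumptions rather than verbatim configuration from this case study.

```python
# A hedged sketch of steering requests toward SambaNova via OpenRouter's
# provider routing options; field names and provider slug are assumptions
# based on OpenRouter's routing documentation.
import requests

payload = {
    "model": "deepseek/deepseek-r1",
    "messages": [{"role": "user", "content": "Hello"}],
    # Prefer SambaNova for throughput; disallow falling back to other providers.
    "provider": {"order": ["SambaNova"], "allow_fallbacks": False},
}
response = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": "Bearer YOUR_OPENROUTER_API_KEY"},  # placeholder
    json=payload,
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```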
[Stat cards: LLMs available across dozens of providers · average number of models companies use · providers on the platform]
“SambaNova, as a high throughput provider, has an important role in the market for use cases where they care about the total time to last token for larger prompts.”
— Chris Clark, OpenRouter Chief Operating Officer
In the video above, Chris Clark, Chief Operating Officer at OpenRouter, discusses how OpenRouter leverages SambaNova's performance to meet customer demands.