Parasail uses SambaNova Cloud to deliver extreme performance

Parasail's AI Deployment Network delivers the fastest token speeds available for real-time, low-latency processing

Challenge:

Parasail aggregates AI infrastructure resources to process billions of tokens per day and give its customers the right combination of cost, scalability, and performance for their applications. With global customers ranging from fast-growing startups to large enterprises, Parasail delivers an AI deployment network with a highly diverse set of resources designed to meet a wide range of requirements.

While broadly available GPU-based resources met the needs of customer workloads that are not price sensitive, some customers needed the fastest possible token speeds for real-time, low-latency processing.

Solution:

Parasail integrated SambaNova Cloud into its environment. Powered by the SN40L, SambaNova's Reconfigurable Dataflow Unit (RDU), SambaNova Cloud delivers the fastest inference on the largest and best open-source models.

With a global network of resources, SambaNova Cloud connects seamlessly into the Parasail AI Deployment Network. Parasail customers can select SambaNova from the list of available solutions and immediately begin taking advantage of lightning-fast inference across a range of models, including DeepSeek R1 671B, DeepSeek V3, Llama 3.1 405B, and many more.

 
 

In the above video, Parasail works with an embedding index of hundreds of millions of scientific papers. When a user asks a question, delivering a high-quality answer takes several steps. In this example, the system first pulls out the relevant facts and additional content, then runs a full-scale LLM on those results before delivering the final answer. While the NVIDIA GPUs take some time to deliver a result, the response from SambaNova is immediate.
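
The retrieve-then-generate flow described above can be illustrated with a minimal Python sketch. It is not Parasail's or SambaNova's implementation: the bag-of-words `embed` function stands in for a real embedding model, the `papers` list stands in for an index of hundreds of millions of documents, and `llm` is a hypothetical callable that would wrap whichever inference endpoint is selected. All function names here are assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Callable, List


def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' (assumption: a real system uses a learned model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, papers: List[str], k: int = 2) -> List[str]:
    """Step 1: pull out the k passages most similar to the question."""
    q = embed(question)
    ranked = sorted(papers, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def answer(question: str, papers: List[str],
           llm: Callable[[str], str], k: int = 2) -> str:
    """Step 2: run the LLM on the retrieved content and return its output."""
    context = retrieve(question, papers, k)
    prompt = (
        "Context:\n"
        + "\n".join(f"- {c}" for c in context)
        + f"\n\nQuestion: {question}\nAnswer:"
    )
    return llm(prompt)


if __name__ == "__main__":
    papers = [
        "Graphene conducts electricity very well",
        "Protein folding is predicted by deep learning models",
        "Deep learning models improve weather forecasts",
    ]
    # Hypothetical stand-in for a hosted model call; it just echoes the prompt.
    echo_llm = lambda prompt: prompt
    print(answer("How is protein folding predicted", papers, echo_llm, k=1))
```

Because the generation step runs only after retrieval finishes, the LLM's token speed directly sets how long the user waits for the final answer, which is why this kind of pipeline is sensitive to inference latency.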