In today's fast-paced business landscape, enterprises need more than just the latest AI model to solve their biggest challenges. They need a platform optimized for speed, efficiency, and accuracy. With our platform and a collection of expert models fine-tuned on their own data, enterprises can improve customer satisfaction and employee experience. According to a recent Gartner survey, these are the top two AI use cases on the minds of CEOs.
Just last week, Meta released its largest open-source model to date, Llama 3.1 405B, comparable in quality to OpenAI’s GPT-4o.
Today, we’ve set a world performance record of 114 tokens per second on this model, independently verified by Artificial Analysis. This was accomplished on a single 16-socket node and delivered with full 16-bit precision. No other platform has achieved this speed with this accuracy to date. It’s a testament to SambaNova's commitment to solving the most pressing AI problems facing enterprises today.
"Artificial Analysis has independently benchmarked SambaNova as serving Meta's Llama 3.1 Instruct 405B model at 114 tokens per second, the fastest of any provider we have benchmarked and over 4 times faster than the median provider. Llama 3.1 delivers leading quality but is large at 405B parameters and is therefore slow on GPU systems. SambaNova's leading speed, delivered on its custom RDU chips, lessens this trade-off between quality, size, and speed and supports Llama 3.1 405B being used in more speed-sensitive use cases, such as consumer applications, customer support, AI agents, and many others." - George Cameron, Co-Founder, Artificial Analysis
What does this mean? Enterprises can now deploy their own private GPT on our platform with SambaNova Suite. And thanks to our fourth-generation RDU chip, the SN40L, they can achieve real-time results that were previously impossible with slower, less efficient solutions.
The speed of our platform unlocks the ability to chain multiple prompts together in real time with GPT quality, unlike any other platform. This enables a new set of enterprise use cases that we are seeing deployed today.
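Prompt chaining simply means feeding one model call's output into the next prompt, so total latency scales with per-call speed. As a minimal sketch (not SambaNova's actual API — the `generate` callable below is a stand-in for any completion endpoint):

```python
def chain_prompts(generate, templates, initial_input):
    """Run a sequence of prompt templates, feeding each model
    response into the next template via the {previous} slot."""
    result = initial_input
    for template in templates:
        prompt = template.format(previous=result)
        result = generate(prompt)
    return result

# Stub model for illustration; a real deployment would call an
# inference endpoint here instead.
def stub_model(prompt):
    return f"[answer to: {prompt}]"

reply = chain_prompts(
    stub_model,
    ["Summarize this support ticket: {previous}",
     "Draft a customer reply based on: {previous}"],
    "Printer offline after firmware update",
)
```

With two or three chained calls per user interaction, each call must return in a fraction of a second for the whole chain to feel real-time, which is where raw tokens-per-second throughput matters.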
To see the speed for yourself, try the Llama 3.1 405B demo at https://sambanova.ai/
Developers interested in building enterprise use cases should reach out for early access to our APIs to start building their enterprise GPT.