Generative AI, enabled by large language models (LLMs), has the potential to revolutionize every function within every enterprise. Yet for all the hype, LLMs have so far only hinted at what is possible. Achieving that potential will require even larger models, and larger models will require a system designed to run them.
Today, SambaNova enhanced the SambaNova Suite – the only purpose-built, full-stack LLM platform – with its fourth-generation chip, the SN40L RDU. The SN40L has a revolutionary Dataflow design that makes it capable of both dense and sparse computation. Combined with a three-tier memory structure comprising on-chip memory, high-bandwidth memory, and high-capacity memory, the SN40L enables models of up to 5 trillion parameters with sequence lengths of 256k+. This will enable higher quality models, faster training and inference on a single platform, and an overall lower total cost of ownership.
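To give a sense of scale for the parameter counts mentioned here, the following is a back-of-the-envelope sketch of how much memory model weights alone occupy. It assumes 16-bit (2-byte) weights, which is our assumption and not a figure from this announcement, and it ignores activations, caches, and optimizer state. The function name is hypothetical.

```python
def weight_memory_terabytes(num_parameters: float, bytes_per_parameter: int = 2) -> float:
    """Memory needed to hold the weights only, in decimal terabytes (1 TB = 1e12 bytes).

    bytes_per_parameter=2 assumes 16-bit weights; this is an illustrative
    assumption, not a published specification.
    """
    return num_parameters * bytes_per_parameter / 1e12


if __name__ == "__main__":
    # Model sizes named in this post, plus the 5 trillion parameter upper bound.
    for label, params in [("7B", 7e9), ("70B", 70e9), ("176B", 176e9), ("5T", 5e12)]:
        print(f"{label}: {weight_memory_terabytes(params):.3f} TB of weights")
```

Under that assumption, a 5 trillion parameter model needs roughly 10 TB for its weights, which is why a high-capacity memory tier matters alongside faster, smaller tiers.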
The new chip is just one of the enhancements to the SambaNova Suite that enable it to solve some of the biggest challenges enterprises face when deploying generative AI at scale. Enterprises need the power of ever-larger models, along with access control and enhanced security, combined with the ability to quickly and easily adapt those models with private organizational data.
SambaNova addresses this with a modular approach – building a large model as a composition of smaller expert models that can be continually improved, adapted, and added to. Using SambaNova hardware and software and its unique memory advantages, we can deliver massive models (up to 5 trillion parameters) without sacrificing inference throughput or security.
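As a rough illustration of what composing a large model from smaller experts can look like, here is a minimal sketch in which a router sends each prompt to the best-matching expert. Everything in it is hypothetical: the `Expert` and `ComposedModel` names, the keyword-overlap routing, and the stub generators are stand-ins chosen for brevity, not a description of how SambaNova Suite routes requests.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Expert:
    """A smaller model specialized for one domain (hypothetical structure)."""
    name: str
    keywords: List[str]
    generate: Callable[[str], str]


class ComposedModel:
    """Routes each prompt to one expert; experts can be added at any time."""

    def __init__(self, fallback: Expert) -> None:
        self.experts: Dict[str, Expert] = {}
        self.fallback = fallback

    def add_expert(self, expert: Expert) -> None:
        # Adding or replacing an expert does not touch the others.
        self.experts[expert.name] = expert

    def route(self, prompt: str) -> Expert:
        # Toy router: pick the expert sharing the most keywords with the prompt.
        words = set(prompt.lower().split())
        best, best_score = self.fallback, 0
        for expert in self.experts.values():
            score = len(words & set(expert.keywords))
            if score > best_score:
                best, best_score = expert, score
        return best

    def generate(self, prompt: str) -> str:
        return self.route(prompt).generate(prompt)


if __name__ == "__main__":
    general = Expert("general", [], lambda p: f"[general] {p}")
    model = ComposedModel(fallback=general)
    model.add_expert(Expert("legal", ["contract", "clause", "liability"],
                            lambda p: f"[legal] {p}"))
    model.add_expert(Expert("finance", ["invoice", "revenue", "budget"],
                            lambda p: f"[finance] {p}"))
    print(model.generate("review this contract clause"))
```

The point of the design is that each expert can be retrained or swapped independently, and new experts extend the overall model without retraining the rest.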
Included with SambaNova Suite are some of the largest and most powerful open-source models available today, including Llama-2, BLOOM 176B, and more. Using SambaNova Suite, these massive models can be adapted with customer data for greater accuracy. Once a model has been fine-tuned, the customer owns that model in perpetuity.
When customers choose the SambaNova Suite they get the benefits of:
- Inference-optimized systems with a hierarchical memory stack for high capacity and high speed.
- Llama-2 variants (7B, 70B): state-of-the-art open-source language models enabling customers to adapt, expand and run the best LLMs available, while retaining ownership of these models.
- BLOOM 176B: the most accurate multilingual foundation model in the open-source community, enabling customers to solve more problems across a wide variety of languages, while also being able to extend the model to support new, low-resource languages.
- Larger memory that unlocks true multimodal capabilities from LLMs, enabling companies to easily search, analyze and generate data across multiple modalities.
- Lower total cost of ownership (TCO) for AI models due to greater efficiency in running LLM inference.
- A new embeddings model for vector-based retrieval-augmented generation, enabling customers to convert their documents into vector embeddings that can be retrieved during the Q&A process, so that answers are grounded in the retrieved content rather than hallucinated. The LLM then takes the retrieved results and analyzes, extracts or summarizes the information.
- A world-leading automatic speech recognition model to transcribe and analyze voice data.
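The retrieval-augmented generation flow described in the list above can be sketched in a few lines. This is a toy illustration only: `embed` is a hypothetical bag-of-words stand-in for a real embeddings model, the function names are ours, and the final LLM call is left out, so it should not be read as the SambaNova Suite API.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy stand-in for an embeddings model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, documents: List[str], k: int = 1) -> List[str]:
    """Return the k documents whose embeddings are closest to the query."""
    q = embed(query)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, documents: List[str], k: int = 1) -> str:
    """Assemble the prompt an LLM would receive: retrieved context plus question."""
    context = "\n".join(retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


if __name__ == "__main__":
    docs = [
        "the refund policy allows returns within 30 days",
        "the office is closed on public holidays",
    ]
    print(build_prompt("what is the refund policy", docs))
```

Because the model is asked to answer from the retrieved passage, its response can be traced back to a customer document instead of relying only on what it memorized during training.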
SambaNova Suite is the first purpose-built LLM platform and will enable enterprises to finally take advantage of the true potential of AI.