Samba-1: A Composition of Experts Model

Posted by Keith Parker on February 28, 2024

SambaNova announces Samba-1, the first trillion-parameter generative AI model that meets enterprise requirements for performance, accuracy, scalability, and total cost of ownership (TCO). Samba-1, powered by SambaNova Suite, can deliver these benefits as the first model built on a Composition of Experts architecture.

Until now, large enterprises and government organizations have been challenged with deploying generative AI at scale. They recognize the incredible potential of this technology to streamline processes across the entire organization, reduce costs, increase productivity, improve supply chains, and much more. However, concerns about cost, complexity, security, data privacy, model ownership, and regulatory compliance have slowed their adoption of generative AI.

Delivered as part of SambaNova Suite, Samba-1 brings enterprise customers all the accuracy and depth of knowledge that only trillion+ parameter models can offer combined with the performance and trainability of smaller models, without any of the drawbacks of either.

Unlike the other trillion-parameter models available today, which are built as single, monolithic models, a Composition of Experts (CoE) works by aggregating multiple small "expert" models into a single large solution. These function as if they were one large model, delivering all the benefits of trillion-parameter models, including broad knowledge across a variety of topics, high accuracy, and multimodality.

In fact, a CoE model can offer greater knowledge and accuracy for specialized domains than other large models. This is because individual smaller models can be trained for specific domains, such as finance, law, physics, biology, or any other specialized topic, and then added to the CoE. This brings high accuracy for that specific domain to the model, without the need to retrain the entire trillion-parameter model.
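The routing idea behind a CoE can be illustrated with a minimal sketch. This is a hypothetical toy, not SambaNova's implementation: a simple keyword match stands in for the trained router model, and string-returning functions stand in for the expert models.

```python
# Toy Composition of Experts: a router selects one domain expert per prompt.
# All names here (EXPERTS, ROUTES, route, answer) are illustrative assumptions.

EXPERTS = {
    "finance": lambda prompt: f"[finance expert] answer to: {prompt}",
    "law":     lambda prompt: f"[law expert] answer to: {prompt}",
    "general": lambda prompt: f"[general expert] answer to: {prompt}",
}

# In a real CoE the router is itself a trained model; a keyword table
# stands in for it here.
ROUTES = {"revenue": "finance", "contract": "law"}

def route(prompt: str) -> str:
    """Pick the expert domain that best matches the prompt."""
    for keyword, domain in ROUTES.items():
        if keyword in prompt.lower():
            return domain
    return "general"

def answer(prompt: str) -> str:
    # Only the selected expert runs; the others stay idle, which is
    # what keeps CoE inference cheaper than running a monolithic model.
    return EXPERTS[route(prompt)](prompt)

print(answer("Summarize this contract clause."))
```

Adding a new domain is just a matter of training one more small expert and registering it with the router, which is why the architecture scales without retraining the whole composition.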

This makes the model much more trainable, scalable, and flexible than other large models. Training trillion-parameter models, such as OpenAI's GPT-4, has been estimated to cost over $100 million. Clearly, at that cost, re-training the model is not a viable option for most organizations.

Samba-1 can be fine-tuned with private customer data to bring internal knowledge such as part numbers, customer account information, and marketing campaigns into the model. This information makes the model significantly more valuable to the organization, as it can address issues that models trained only on general data cannot.

Because only the individual expert model selected for a prompt needs to run, a CoE model has low inference costs. With monolithic models, the entire model must be loaded into accelerator memory for every prompt. Given the size of these models, this means an expensive and complex infrastructure that can include hundreds of GPUs.

Further, the CoE architecture uses a single model that is trained to act as a router to the other models. The router model interprets user prompts and directs each query to the most appropriate model to respond. The use of a router model enables enterprises to apply rule- and role-based access control to ensure that data remains private. This level of access control is not available on large, monolithic models and is an absolute requirement for any enterprise or government.
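The access-control idea can be sketched in a few lines. Again this is a hedged illustration under assumed names (`ALLOWED_ROLES`, `dispatch`), not SambaNova's API: each expert is tagged with the roles allowed to query it, and the router checks the caller's role before dispatching a prompt.

```python
# Role-based access control in front of a CoE router (illustrative only).
# Roles and expert names are hypothetical examples.

ALLOWED_ROLES = {
    "finance": {"analyst", "cfo"},
    "hr":      {"hr_staff"},
    "general": {"analyst", "cfo", "hr_staff", "employee"},
}

def dispatch(prompt: str, user_role: str, expert: str) -> str:
    """Send the prompt to the expert only if the caller's role permits it."""
    if user_role not in ALLOWED_ROLES.get(expert, set()):
        raise PermissionError(f"role '{user_role}' may not query '{expert}'")
    # A real router would invoke the expert model here.
    return f"[{expert}] response to: {prompt}"
```

Because every prompt passes through the router, the policy is enforced in one place: a general employee's query can never reach the HR or finance experts, so the private data those experts were trained on stays private.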

Samba-1 is the third trillion-parameter generative AI model on the market, and it is the only one to provide the data privacy, security, high accuracy, flexibility to address any use case, and access control that enterprises require. As part of SambaNova Suite, Samba-1 customers enjoy flexibility of deployment, model ownership, and transparency at a TCO that is 10x better than other platforms.

Learn more about how SambaNova Suite with the trillion parameter Samba-1 CoE model can transform your organization.

Topics: business, Blog