
The Next Generation of Large Models

by Keith Parker

April 23, 2024

Generative AI presents a transformational opportunity for enterprise organizations. It can streamline processes across the entire organization, reduce costs, increase productivity, improve supply chains, and much more.

Taking advantage of this opportunity requires that organizations utilize larger models than have been practical to this point. The larger the AI model, the more accurate, meaningful, and functional the results that it provides will be, so running the largest models is critical. The challenge then becomes choosing the right model and platform to power continuously expanding, very large models.

This is the first in a two-part series of blog posts that will discuss why large models are important and how to choose the right one for your organization.

The Generative AI opportunity

The opportunity that generative AI presents to enterprises is immense. Imagine a single AI system that understands all of a company's products and works with the sales team to better understand customer needs to anticipate what they will need and when. A system that can write the contract, in the local language and in accordance with all local and governing laws. Once the order is placed, the system then helps manage the supply chain to ensure timely delivery of the order. If the customer inquires about their account, the system will be aware of all of their transactions and be able to help solve the customer issue quickly, unlike legacy interactive systems, which often do little more than frustrate customers.

Only recently this was the realm of science fiction. Now, the era of pervasive AI is about to make all of this a reality. Generative AI will touch every aspect of business, from sales and marketing to engineering and product development, as well as legal, finance, supply chain management, customer service, and more. There is no area that will be untouched by this transformative technology.

Turning this into reality will require models of unprecedented size, so that they can be trained in every aspect of the business. Serving each area of the organization will require models that are trained on each of these discrete functions and then fine-tuned on the enterprise's private data.

The trend to smaller models

In an effort to take advantage of the opportunity AI affords, organizations have been adopting smaller models, often in the range of a few billion to tens of billions of parameters. These smaller models have enabled organizations to begin incorporating AI into their workflows and drive greater efficiencies.

Despite the acknowledged need to move to the largest models possible, organizations have been adopting these smaller models for several reasons. First, historically they could not effectively run larger models without massive investments in infrastructure and personnel. Second, although they strained legacy technology to its limits, these smaller models could be trained and fine-tuned on company data for greater domain-specific accuracy. That kind of training is not practical with the very large models available today. Finally, from a cost perspective, running larger models has not been a practical option for most organizations. Simply put, using the largest AI models has been too time consuming and expensive for all but the largest organizations, even with the massive opportunity that AI offers.

Even for those rare organizations that are willing to expend the resources necessary to reap the benefits of large AI models, supply chain issues have made acquiring the necessary resources difficult, if not impossible. These are some of the reasons why companies have been forced to run smaller models in an effort to reap the benefits of AI.

The small model challenge

While organizations have been able to effectively run smaller models, these models have significant limitations. In AI, there has been a long-term trend toward larger models. This is because the bigger the model, the more it can do and the more accurate it is. Generative AI only became possible once models increased past a certain size. To serve multiple functions within an enterprise, such as finance, legal, marketing, engineering, and more, with the same model, even larger models will be required.

Another drawback of smaller models is that as they become further trained in one area, such as finance, they become worse in other areas, such as legal or engineering. As a result each subject matter area requires its own model, leading to inefficiencies, complexity, higher cost, and lower returns.

Application specific smaller models are still useful when the model is only required to perform one specific function very well, such as focusing on a particular area of scientific research. But for enterprise organizations that need multiple functions to be served across the organization, bigger models are required.

Very large models

As a rule, the larger the model the better the results. Larger models offer better accuracy, greater capabilities, and more flexibility than smaller models. However, not all very large models are the same. Simply making one very large model has its own set of drawbacks. A single very large model is complex, costly, and time consuming to train, and the ability to continuously train is a vital component to having models that deliver value. The most valuable data any organization has is its most recent data. Information such as customer orders, market trends, and social activity that impacts the organization from the last two days is incredibly valuable. That same information from a year ago is less important. Being able to continuously train on the latest data, so that the model is up to date on that recent information, is simply not practical with a single massive model.

Smaller models are limited in the areas they can be trained on and do not offer the accuracy or flexibility that large organizations require. Very large models deliver greater accuracy and the ability to perform cross functional operations, but cannot be trained on the latest data affordably or within a reasonable time frame.

The answer is to create a best of both worlds solution that combines the trainability and manageability of smaller models with the incredible capabilities of large models. This can be done by aggregating multiple smaller models into a single large model.

These smaller individual models, ranging from just a few up to hundreds of models or more, could each be trained in a particular area of expertise. When an individual model needs further training, it can be retrained quickly and cost effectively, without impacting the rest of the larger model. All the benefits of large models, including accuracy, the flexibility to address multiple functions, and multimodality, can be delivered with this solution, without the drawbacks. Models can be grown to virtually any size, to meet the needs of even the largest use cases, simply by adding more small models.

This type of larger model can be constructed in two ways. The first is what is often referred to as a Mixture of Experts model, where the experts are not defined by the user, but emerge as a function of training the model. This method allows the creation of massive models using a combination of smaller ones, but does so at the cost of control and manageability, which makes it unsuitable for large enterprises.

The Composition of Experts solution

The second way is a Composition of Experts methodology, which also allows multiple experts to be combined into a single model of any size. Here the experts are predefined, and one of the models is trained to route requests to the appropriate smaller models.
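To make the routing idea concrete, here is a minimal sketch in Python. It is an illustration only, not SambaNova's implementation: in a real Composition of Experts the router is itself a trained model, while this sketch substitutes a simple keyword-overlap score as a stand-in. The expert names, the keyword lists, and the `route` and `answer` functions are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet


@dataclass(frozen=True)
class Expert:
    """One predefined expert: a name, routing keywords, and a generate function."""
    name: str
    keywords: FrozenSet[str]
    generate: Callable[[str], str]


def make_stub(name: str) -> Callable[[str], str]:
    # Stand-in for a real expert model; it only labels the response.
    def generate(prompt: str) -> str:
        return f"[{name}] response to: {prompt}"
    return generate


# Hypothetical experts, defined up front rather than emerging from training.
EXPERTS: Dict[str, Expert] = {
    "finance": Expert("finance", frozenset({"invoice", "revenue", "budget"}), make_stub("finance")),
    "legal": Expert("legal", frozenset({"contract", "liability", "clause"}), make_stub("legal")),
    "general": Expert("general", frozenset(), make_stub("general")),
}


def route(prompt: str, experts: Dict[str, Expert], default: str = "general") -> str:
    """Pick the expert whose keywords overlap the prompt most (stand-in for a trained router)."""
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    scores = {name: len(words & expert.keywords) for name, expert in experts.items()}
    best = max(scores, key=scores.get)
    # Fall back to the default expert when no keyword matches at all.
    return best if scores[best] > 0 else default


def answer(prompt: str) -> str:
    """Route the prompt, then let only the chosen expert generate the response."""
    return EXPERTS[route(prompt, EXPERTS)].generate(prompt)
```

Because each expert sits behind the same small interface, one expert can be retrained or replaced without touching the others, which is the manageability benefit described above.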

One of the primary advantages of a Composition of Experts over a Mixture of Experts model is that rules can be put into place to control access. This means that, for example, someone in marketing cannot access sensitive information that is controlled by finance. The ability to control access to sensitive data is a vital component of any enterprise deployment. A Composition of Experts model is the right choice for any enterprise that wants to take advantage of the massive benefits of very large AI models, while maintaining security and control of their data.
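Because the experts are explicit, named components, an access rule can be checked before a request ever reaches an expert. The sketch below shows one way that might look; it is an assumption for illustration, not a description of any product's access control. The role names, the `ALLOWED_EXPERTS` table, and the `dispatch` function are hypothetical.

```python
from typing import Dict, Set


class AccessDenied(Exception):
    """Raised when a role asks for an expert it is not permitted to use."""


# Hypothetical policy: which experts each role may query.
ALLOWED_EXPERTS: Dict[str, Set[str]] = {
    "marketing": {"marketing", "general"},
    "finance": {"finance", "general"},
}


def dispatch(role: str, expert_name: str, prompt: str) -> str:
    """Check the caller's role against the policy before the expert sees the prompt."""
    allowed = ALLOWED_EXPERTS.get(role, set())  # unknown roles get no access
    if expert_name not in allowed:
        raise AccessDenied(f"role '{role}' may not query expert '{expert_name}'")
    # Stand-in for calling the real expert model.
    return f"[{expert_name}] response to: {prompt}"
```

With this policy, a marketing user who asks the finance expert a question is rejected at the dispatch step, while a finance user asking the same question is served.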

SambaNova Suite

SambaNova Suite is a full-stack platform, from chip to model, purpose-built for generative AI. Available on-premises or in the cloud, it comes with the latest open source models, which can be delivered as a Composition of Experts model of up to 5 trillion parameters, giving customers all the advantages of the largest open source models, along with the performance and efficiency that they need.

This is one in a series of blog posts on what it means for generative AI to be enterprise-grade. Read the entire series:
Enterprise-grade AI | The Next Generation of Large Models | Model Ownership
