Retrieval Augmented Generation in SambaNova Suite
The greatest value of large language models (LLMs) to enterprise organizations lies in their ability to correctly answer user prompts. In these use cases, it is especially important that the model responds to prompts with the most accurate and up-to-date information and does not hallucinate. That is why we are proud to announce that we have added Retrieval Augmented Generation (RAG) capabilities to SambaNova Suite, as part of a multi-step model process around pre-training, fine-tuning and then retrieval.
When a user queries an LLM through a prompt, the LLM then goes to the data that it has been trained on to obtain a response. The accuracy of that response will depend on the quality and relevance of the data that the model has to source a reply. If the model has been trained on public data, then it can only respond based upon the data that was available at the time it was trained. If that information is now outdated, the model will give an outdated answer. If the model cannot find a correct answer, then it may hallucinate. This is why it is so important that the model be able to access data that is both accurate and current.
The challenge then becomes that training large monolithic models on big data sets can be extraordinarily costly and time consuming. As a result, if you are using a closed model there is a reasonable chance that it was trained on data that is either outdated or that was not available to respond to a current question.
With the addition of RAG support, SambaNova can help mitigate this as the model can now “retrieve” new information through a process of searching and analyzing external data sets, such as a knowledgebase. This enables the model to use information in that knowledgebase to build a response to user prompts, even though it has not been directly trained on that data. This can greatly increase the accuracy of the response, which will ultimately drive greater value to the organization.
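The retrieve-then-generate flow described above can be illustrated with a minimal Python sketch. It is not SambaNova's implementation: the knowledge base contents, the word-overlap scoring, and the function names `retrieve` and `build_prompt` are all assumptions made for illustration. A production system would more likely use an embeddings model and a vector index than word overlap. The sketch stops at the prompt that would be sent to the LLM.

```python
import re

# Hypothetical knowledge base: in practice these would be an organization's documents.
KNOWLEDGE_BASE = [
    "The travel policy was updated in March: economy class is required for flights under six hours.",
    "Expense reports must be submitted within 30 days of purchase.",
    "The office is closed on public holidays.",
]


def tokenize(text):
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question (toy scoring)."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question, passages):
    """Place the retrieved passages ahead of the question so the model can use them."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
    )


question = "When must expense reports be submitted?"
passages = retrieve(question, KNOWLEDGE_BASE)
prompt = build_prompt(question, passages)
print(prompt)
```

Because the passages are looked up at query time, updating the knowledge base changes what the model sees without any retraining.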
By adding RAG support, SambaNova Suite continues to deliver the performance, accuracy, and flexibility to power the largest models.
For more information on SambaNova's RAG support, read this blog.
Los Alamos National Laboratory expands partnership with SambaNova
Los Alamos National Laboratory recently announced that they are expanding their existing partnership with SambaNova Systems to facilitate their generative AI and large language model (LLM) capabilities.
A federally funded research and development center, Los Alamos National Laboratory aligns their plan to priorities set by the Department of Energy’s National Nuclear Security Administration (DOE NNSA) and key national strategy guidance documents. They work across all of the DOE’s missions: national security, science, energy, and environmental management.
As part of their on-going mission, Los Alamos National Laboratory has a new focus on deploying generative AI LLMs. To achieve this goal as quickly as possible, they have chosen the SambaNova Suite as the platform to deploy LLMs.
“At the dawn of the exascale supercomputing era, we are increasingly relying on AI to be part of the ASC computing ecosystem to support our mission objectives now and in the coming years,” said NNSA ASC program director Thuc Hoang. “We are pleased to be scaling up our existing deployments of SambaNova Systems to advance generative AI and large language model technologies to contribute to the ASC program.”
This is significant, as despite having the scientific and computing resources of a DOE lab, combined with their long history of technological achievements and innovation, they have chosen to use the SambaNova platform instead of building their own.
SambaNova Suite, the only fully integrated platform from chip to models that is purpose built for generative AI, offers Los Alamos National Laboratory a fast and easy way to apply generative AI LLMs to their mission in record time.
Some of the largest and latest open source models available come pre-trained as part of SambaNova Suite. Customers can then fine tune those models on their own data for greater accuracy. Once the models have been adapted by the customer, they then own those models in perpetuity. This gives them the ability to know what data their models have been trained on, the training weights of those models, and the ability to maintain control of those models. This gives them a level of model explainability and security that closed model providers do not offer.
SambaNova Suite has the flexibility to be deployed as a cloud service or as an on-premises solution for customers that need the security of having the entire system behind their firewall.
In addition to deploying SambaNova Suite to advance their generative AI and LLM capabilities, Los Alamos National Laboratory is also scaling up their existing SambaNova DataScale system to accelerate their national security science, technology, and engineering projects.
The combination of SambaNova Suite and SambaNova DataScale delivers Los Alamos the speed, flexibility, and performance that they need to meet their mission objectives.
Welcome to the era of pervasive AI
While it may feel like we have been moving far and fast with AI, in reality we have been only exploring the foothills of this transformative technology. Now, we are on the edge of an era of Pervasive AI, when AI will become a core technology to everything, much like the Internet is today. In the 1990s, we saw glimmers of the internet’s potential, and now it underpins everything – how we work, consume entertainment, order goods and services, and every other aspect of business and society. This is the kind of impact that AI is about to have – only faster.
To date, organizations have used hundreds or even thousands of small models, each trained to perform a specific task. These small models are difficult to manage, expensive to maintain, and often do not work with each other. Large language models (LLMs) have made it possible for enterprises to use fewer models, each capable of replacing dozens of smaller models to reduce cost and complexity while also increasing their ability to generate valuable content. This capability is enabled by the sheer size of the models, which often contain tens of billions of parameters.
As we enter the era of pervasive AI, the accuracy demanded by enterprise tasks means that LLMs will need to have trillions of parameters. These models will be composed of hundreds of smaller models, all working together as a Composition of Experts model, which provides a high degree of modularity and extensibility. This will be the next generation of model architecture.
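One way to picture the modularity of a Composition of Experts is a router that sends each prompt to a domain-specific expert. The Python sketch below is purely illustrative and assumes nothing about how SambaNova implements routing: the keyword-based router, the expert names, and the stand-in expert functions are all hypothetical, and a real router would more likely be a learned model.

```python
from typing import Callable, Dict


# Stand-ins for expert models; each would be a separately trained model in practice.
def finance_expert(prompt: str) -> str:
    return f"[finance] {prompt}"


def legal_expert(prompt: str) -> str:
    return f"[legal] {prompt}"


def general_expert(prompt: str) -> str:
    return f"[general] {prompt}"


# Hypothetical registry: adding an expert means adding an entry here.
EXPERTS: Dict[str, Callable[[str], str]] = {
    "finance": finance_expert,
    "legal": legal_expert,
}

# Toy routing table mapping keywords to expert names (an assumption for illustration).
KEYWORDS = {
    "invoice": "finance",
    "revenue": "finance",
    "contract": "legal",
    "liability": "legal",
}


def route(prompt: str) -> str:
    """Return the name of the first expert whose keyword appears in the prompt."""
    lowered = prompt.lower()
    for keyword, expert_name in KEYWORDS.items():
        if keyword in lowered:
            return expert_name
    return "general"


def answer(prompt: str) -> str:
    """Dispatch the prompt to the selected expert, falling back to a general model."""
    expert = EXPERTS.get(route(prompt), general_expert)
    return expert(prompt)


print(answer("Summarize the liability clause in this contract."))
print(answer("What was last quarter's revenue?"))
```

The point of the design is that an expert can be added or improved by changing one registry entry, without touching the others.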
Today, a large language model can accurately summarize a document such as a contract, or read and interpret an image such as a medical scan. However, these models are limited in that as they become better in one function, such as finance, they lose capability in others, such as legal.
When models reach a trillion parameters, these limitations can be reduced or eliminated. The upside is that only a few of these very large models will be required for even the largest enterprise and they will be able to connect functions, such as finance, legal, operations, marketing, people ops, and more across the organization for unprecedented levels of efficiency.
AI will be multimodal, capable of ingesting and analyzing a broad range of data sources, including voice recordings, images, and text among others. It will be able to do this across a range of languages, a critical capacity for global organizations.
But delivering this capacity means that new capabilities are needed. Powering very large models across an enterprise organization requires a hardware platform that has been designed for AI at scale, so that the models can be run on a small, optimized footprint instead of ever-larger clusters of GPUs. That platform needs fully-integrated software that can optimize the models to run on the hardware and to manage and maintain the models. The models themselves will have to be open source, both to provide organizations with the explainability they will need to serve regulated industries, as well as to ensure that every company has the right to the IP they build in their own models. Finally, the platform will need to connect AI models to enterprise workflows in a seamless manner. Ultimately, this will require an integrated hardware-software, full stack platform, purpose-built for AI.
Today, there is such a system: SambaNova Suite, the only full-stack, fully integrated platform purpose-built for generative AI and optimized for enterprise and government organizations. Available on-premises or in the cloud, SambaNova Suite is powered by the revolutionary SN40L chip and delivers state-of-the-art Composition of Experts models that can be adapted with customer data for greater accuracy.
Welcome to the era of pervasive AI!
Delivering on the promise of pervasive AI: SambaNova Suite, powered by the SN40L
Generative AI, enabled by large language models (LLMs), has the potential to revolutionize every function within every enterprise. Yet for all the hype, so far LLMs have only hinted at what is possible. To achieve that potential will require even larger models, and larger models will require a system designed to run them.
Today, SambaNova enhanced the SambaNova Suite – the only purpose-built, full stack LLM platform – with its revolutionary fourth-generation chip, the SN40L RDU. The SN40L has a revolutionary Dataflow design, making it capable of both dense and sparse computation. Combined with a three-tier memory structure comprising on-chip memory, high-bandwidth memory, and high-capacity memory, the SN40L enables models of up to 5 trillion parameters with sequence lengths of 256k and beyond. This will enable higher quality models, faster training and inference on a single platform, and an overall lower total cost of ownership.
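To give a sense of why memory capacity matters at this scale, the following back-of-the-envelope Python calculation estimates the memory needed just to hold a model's weights. The figure of 2 bytes per parameter assumes 16-bit weights, which is an assumption for illustration and not a statement about how the SN40L stores models. Activations, caches, and training state are ignored.

```python
def weight_memory_terabytes(num_parameters: float, bytes_per_parameter: int = 2) -> float:
    """Memory needed to store the weights alone, in terabytes (1 TB = 1e12 bytes)."""
    return num_parameters * bytes_per_parameter / 1e12


# 5 trillion parameters is the figure quoted above; the weight widths are assumptions.
print(weight_memory_terabytes(5e12))     # 16-bit weights
print(weight_memory_terabytes(5e12, 4))  # 32-bit weights
```

Under these assumptions the weights alone come to about 10 TB at 16 bits and 20 TB at 32 bits, which is why a high-capacity memory tier matters for models of this size.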
The new chip is just one enhancement to the SambaNova Suite, which enables it to solve some of the biggest challenges that enterprises face when deploying generative AI at scale. Enterprises need the power of ever-larger models, along with access control and enhanced security, combined with the ability to quickly and easily adapt them with private organizational data.
SambaNova addresses this by using a modular approach – building a large model through a composition of smaller expert models which can be continually improved, adapted, and added to. Using SambaNova hardware and software and its unique memory advantages, we can deliver massive models (up to 5 trillion parameters) without sacrificing inference throughput or security.
Included with SambaNova Suite are some of the largest and most powerful open-source models available today, including Llama-2, BLOOM 176B, and more. Using SambaNova Suite, these massive models can be adapted using customer data for greater accuracy. Once the models have been fine-tuned, the customer then owns that model in perpetuity.
When customers choose the SambaNova Suite they get the benefits of:
- Inference-optimized systems with hierarchical memory stack for high capacity and high speed.
- Llama-2 variants (7B, 70B): state-of-the-art open source language models enabling customers to adapt, expand and run the best LLM models available, while retaining ownership of these models.
- BLOOM 176B: the most accurate multilingual foundation model in the open source community, enabling customers to solve more problems with a wide variety of languages, while also being able to extend the model to support new, low resource languages.
- Larger memory that unlocks true multimodal capabilities from LLMs, enabling companies to easily search, analyze and generate data in these modalities.
- Lower total cost of ownership (TCO) for AI models due to greater efficiency in running LLM inference.
- A new embeddings model for vector-based retrieval augmented generation, enabling customers to convert their documents into vector embeddings. Relevant documents can then be retrieved during the Q&A process, so that answers are grounded in that content rather than hallucinated. The LLM then analyzes, extracts, or summarizes the retrieved information.
- A world-leading automated speech recognition model to transcribe and analyze voice data.
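The vector-based retrieval mentioned in the list above can be sketched in a few lines of Python. The three-dimensional vectors are made up for illustration; a real embeddings model would produce vectors with hundreds or thousands of dimensions. The document names and the `top_k` helper are hypothetical and do not describe the SambaNova embeddings model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up embeddings; in practice an embeddings model would generate these.
DOCUMENT_VECTORS = {
    "refund_policy.txt": [0.9, 0.1, 0.0],
    "holiday_schedule.txt": [0.0, 0.2, 0.9],
    "shipping_terms.txt": [0.6, 0.7, 0.1],
}


def top_k(query_vector, document_vectors, k=2):
    """Return the names of the k documents most similar to the query vector."""
    ranked = sorted(
        document_vectors,
        key=lambda name: cosine_similarity(query_vector, document_vectors[name]),
        reverse=True,
    )
    return ranked[:k]


# Pretend this is the embedding of a question about refunds.
query = [1.0, 0.2, 0.0]
print(top_k(query, DOCUMENT_VECTORS))
```

The documents returned by `top_k` are what would be handed to the LLM as context for its answer.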
SambaNova Suite is the first purpose-built LLM platform and will enable enterprises to finally take advantage of the true potential of AI.
Click here to learn more about the SambaNova Suite and how it can bring the promise of generative AI to your organization, today.
Accelerating HPC Simulations and AI with SambaNova
Accelerating the integration of HPC simulations and AI with SambaNova Systems
SambaNova will be exhibiting at ISC High Performance 2023, May 21-25 in Hamburg, Germany. There we will showcase how generative AI is already being used to accelerate scientific discovery, including the progress we have been making with major research centres around the world. We continue to advance our solutions and capabilities, alongside more and more customer success from integrating HPC simulations and AI.
Below are a couple of links to recent news, as we have scaled up our deployment at Argonne National Laboratory and are now installed in the RIKEN Fugaku system.
We can show significant advantages for deploying LLMs with large sequence lengths, and for high-resolution imaging at resolutions greater than 512³, without the need for complex model partitioning and parallelisation. Our architecture and software stack have the potential to revolutionise scientific research across a variety of domains, enabling a deeper understanding of complex processes, more accurate predictions, and accelerated discovery for our customers.
Come and meet any of our engineers at Booth G703 on the exhibition floor. Marshall Choy, SVP of Product at SambaNova Systems, will also be speaking on Generative AI Driven Scientific Discovery as part of the HPC Forum session on Tuesday, May 23rd from 1:20 PM to 1:40 PM in Hall H, Booth K1001. If you would like to book time to meet with any of our team during the conference, come by the booth and we will make that happen, or just reply to this note.
You could also be in with a chance to WIN a Lego® Robot Inventor Set valued at €359.99! Visit the SambaNova Booth between Monday, May 22nd at 3:00 PM and Tuesday, May 23rd at 3:00 PM CET to enter. The drawing will take place on Tuesday, May 23rd at 3:00 PM CET. Must be present to win!
Accenture and SambaNova: Delivering Generative AI to the Enterprise
Accenture and SambaNova are now delivering powerful generative AI solutions that have been optimized for enterprise and government organizations. Generative AI can analyze massively complex documents, understand volumes of data of varying types, create net new content, and more, so it is hard to overstate its potential impact. The solutions from Accenture and SambaNova are designed to meet the demanding needs of these organizations in ways that consumer grade generative AI does not.
Overcoming the Challenges with Consumer Generative AI
While consumer grade generative AI, such as ChatGPT, has captured the public’s and the media’s imagination, business leaders are struggling with how to take advantage of the massive opportunities this transformative technology presents.
The challenges with incorporating this technology into business applications are significant. Some of these include:
- The models are trained on generic, internet data
- Models refined using an organization’s data may become available to competitors
- Models fine tuned with organizational data remain the property of the vendor
- Governance and auditability may not be possible
Building solutions optimized for the enterprise
Recognizing these challenges, Accenture and SambaNova are partnering to co-develop generative AI solutions, optimized for enterprise and government organizations, that unlock the potential of this technology while meeting the demanding requirements of these organizations. The partnership between the two companies has resulted in solutions that drive line of business efficiency and productivity, with security features that protect the privacy and integrity of the data used by generative AI solutions. These solutions can automate user and employee experiences, streamline operations, improve efficiency, and unlock insights trapped in unstructured data.
Bringing generative AI to the business
The Accenture and SambaNova solutions seamlessly integrate with existing workflows through simple APIs, so there is no need to replace existing tools or processes. Examples of these solutions include Contact Center Intelligence, which enables enterprises to assist agents with customer calls, discover information about customer interactions, and better meet compliance requirements. Document Intelligence extracts information from massive volumes of documents, derives insights from extremely complex documents, and more.
Organizational data, fine-tuning, and governance
These solutions use the latest open source models, pre-trained with domain-specific data. This means that in addition to always having the latest and most powerful models, organizations can take advantage of pre-trained models out of the box.
Models are then further adapted using an organization’s own data for even higher accuracy. Once a model has been trained using internal data, that model becomes a critical asset and is the property of the organization in perpetuity.
These solutions deliver significant benefits to the enterprise, including:
- Governance: Customers control all the layers of the model, not just the last layer.
- Auditability: Get full visibility on the model weights and datasets it was trained on.
- Control: Export the model at any point and maintain ownership of the model.
While consumer tools, such as ChatGPT, have captured the attention of the media, Accenture and SambaNova are delivering systems that are optimized for and can meet the specific requirements of banks and other large enterprises. These solutions deliver the data governance, auditability, and control that these types of organizations demand, on a fully integrated platform, that is available as an on-premises solution or delivered anywhere as a cloud service.
To learn more about this transformative technology, click here.
Solving Enterprise Data Privacy and Security Concerns with Generative AI
As we near the six-month mark of generative AI dominating the headlines, the conversation has quickly shifted from amazement to more practical considerations and risks. In response to leaks of confidential and private data, data privacy and security has quickly become one of the biggest topics of discussion related to generative AI, particularly for enterprises and government organizations.
One of the biggest security and privacy risks is caused by what is known as a ‘shared model backbone’, in which a generative AI tool uses a single model across all of its users and customers. The implication of this is that any data that is used to interact with a generative AI tool, such as ChatGPT, becomes part of the model, improving it over time. However, it also means that this data can be accessed by other users. Unsurprisingly, for enterprises and government organizations this poses serious security and privacy concerns.
In one high profile example, employees at Samsung inadvertently leaked confidential information by sharing meeting minutes and source code in a ChatGPT prompt.
In another example, it was revealed that a bug resulted in leaked sensitive information about ChatGPT user data.
Overcoming these issues requires a fundamentally different approach to enable generative AI for enterprises and government organizations. Generative AI must be deployed within a customer’s firewall, and provide the organization with its own ‘dedicated model backbone’. This means that the organization has its own unique generative AI model that is not shared with any other customers, and can use its own data to adapt and interact with the model without risk of that information being leaked. It also enables these organizations to retain ownership of the models built in this way.
Click here to learn more about how SambaNova Suite delivers generative AI optimized for the enterprise.
Three takeaways from SambaNova’s conversation on generative AI with Ed Abbo, President and Chief Technologist of C3 AI
Last week, SambaNova’s Co-founder and CEO Rodrigo Liang caught up with Ed Abbo, President and Chief Technologist of C3 AI to discuss how generative AI is transforming the enterprise, including how generative AI delivers a new human computer interface for enterprise AI, the importance of verifiability, and why enterprises will require generative AI solutions to be securely deployed within their own environment.
Below are three of my favorite insights from Rodrigo’s conversation with Ed. You can also watch the full video above.
Generative AI has fundamentally changed the human computer interface for enterprise AI
Historically, enterprise AI tools have had different interfaces that required users to learn and be trained on these systems, often requiring learning complex coding languages. Generative AI removes the need for training on these systems by enabling users to both ask a question and receive an answer in natural language. This means that business users can simply ask a question through a ‘search bar’ experience, such as “Where will my supply chain break down?” and receive a detailed answer based on the inputs across different enterprise systems.
Enterprises need verifiability
One of the main differences between consumer and enterprise generative AI is the importance of verifiability. In the enterprise, generative AI not only needs to be highly accurate, but when retrieving an answer to a question it needs to provide references to where that answer came from. This is particularly important in complex enterprise technology environments with hundreds or even thousands of different tools and systems.
Generative AI needs to be deployed securely in an enterprise’s own environment
Just as with any new technology, generative AI needs to meet information security requirements for enterprises and government organizations. That means generative AI models and infrastructure need to be deployed and managed within a company’s own environment.
Be sure to check out our other blog post
Introducing SambaNova Suite for Generative AI
There is no question that generative AI, and the impressive potential it represents, is 2023’s hottest tech trend. The buzz around generative AI has consumed everything from technology press, to mainstream media outlets, and even late night talk shows. Much of this virality has resulted from the public’s ability to engage with consumer generative AI products, such as ChatGPT, and experience first hand some of the most exciting, creative, and surprising outputs from these tools.
Despite all of this buzz about consumer generative AI, the reality is that consumer use cases for generative AI are vastly different from those of enterprises and government organizations. As often happens with exciting consumer technologies, enterprise and government leaders have already started asking “How can we start applying generative AI to solve our business and operational challenges?”
Introducing SambaNova Suite for generative AI
It is with this question in mind that I am excited to announce the SambaNova Suite for generative AI, the first generative AI platform specifically optimized for enterprises and government organizations.
SambaNova Suite is a collection of the highest accuracy generative AI models which can be deployed directly in a customer’s own environment, and can be further adapted using their data for even greater accuracy.
You can learn more about SambaNova Suite by watching the overview below:
Our thesis for SambaNova Suite is based on four key principles.
- Enterprises and government organizations require the highest accuracy: SambaNova Suite delivers a collection of the highest accuracy generative AI models, including both state-of-the-art open source models and models that have been pre-trained by SambaNova, including GPT and BLOOM.
- “Your data, your models”: SambaNova Suite empowers customers to optimize these models with their own data to further increase accuracy, while also allowing them to retain ownership of models that have been adapted with their data.
- An open approach to generative AI: SambaNova Suite has been developed and will continue to evolve as an open platform integrating innovations from ecosystem partners at every layer of the stack, including model development, data, and enterprise integration.
- Deploy Anywhere: SambaNova Suite is a full stack AI offering which can be deployed on-premises or in the cloud, so no data ever needs to leave the customer’s environment. Further, unlike consumer generative AI cloud offerings, SambaNova Suite is delivered on a dedicated model backbone for every customer.
With SambaNova Suite, we are empowering enterprises and government organizations to take advantage of the full potential of generative AI to solve their biggest business and operational challenges, while delivering the flexibility, privacy, and security required of modern technologies and tools. At SambaNova, we think that the generative AI revolution is just beginning, and we are excited to work together with our customers and partners to discover the full exciting potential of this transformational technology.
To see for yourself how SambaNova Suite is empowering enterprises and government organizations to take advantage of the full potential of generative AI to solve their biggest business and operational challenges, watch these product demo videos.
You can learn more about SambaNova Suite for generative AI and our other exciting announcements by visiting our official launch announcement page.