Blog

Zilliz: Powering AI RAG Applications with Vector Embeddings

Written by Rachel Bakke | December 4, 2024

As AI developers strive to build faster, more accurate and contextually relevant Retrieval Augmented Generation (RAG) systems, they face significant challenges in efficiently managing large-scale unstructured data and delivering fast, accurate responses. To overcome these hurdles, SambaNova is working with Zilliz, a cloud-native software company, to showcase the power of combining fast inference with efficient vector databases. This collaboration provides developers with examples on how to build RAG solutions with faster inference, improved resource utilization, and real-time processing capabilities.

Integration Highlights

To make it easier for developers to start building applications, we have provided a few examples to accelerate development:

  1. Enterprise Knowledge Retriever Kit: Milvus is shown within a powerful example of SambaNova’s capabilities as we embedded it into an AI Starter Kit, an example app with UI. This integration allows users to choose Milvus as their vector database option when setting up the kit's RAG workflow locally, providing greater customization and flexibility.
  2. Integration Quickstart Notebooks: Connections have been established between Zilliz resources for Milvus quickstarts to work seamlessly with SambaNova API endpoints.

Quickstart Notebooks

Two concise notebooks are available in the integrations folder, demonstrating how to implement RAG using Zilliz's Milvus and SambaNova's Llama models:

  1. OpenAI Client Approach: Utilizes the HuggingFace Sentence Transformer for embeddings.
  2. LangChain Connector: Uses Instruct Embeddings, offering an alternative approach.

Both notebooks guide users through preparing embeddings, creating collections, and generating responses with context.

Conclusion

SambaNova Cloud and Milvus are pioneering platforms that are working better together and demonstrate the potential of combining cutting-edge vector database technology with faster inference. Developers can get started for free today by signing up for the SambaNova Cloud.