We are excited to announce the addition of Tülu 3 405B, a fine tune of Llama 405B that performs better than DeepSeek V3 on SambaNova Cloud. This powerful open-source model, developed by the Allen Institute for AI (Ai2), represents a significant leap forward in large language model capabilities. Thanks to the SambaNova RDU, we are able to efficiently support this model at over 90tokens/second.
Unparalleled Performance
Tülu 3 405B offers remarkable performance across several key benchmarks, solidifying its position as one of the leading open-source AI models. Here are some of its standout achievements:
- PopQA: It outperformed DeepSeek V3 and GPT-4o models on this set of 14,000 specialized knowledge questions sourced from Wikipedia.
- MMLU: Scored 86.6 on this benchmark, demonstrating strong performance in multi-task language understanding.
- GSM8K: It outperformed every model in its class on this test of math word problems.
As an improved version of Llama 3.1 405B, Tülu 3 enables researchers and developers to distill information in order to create smaller models more quickly and easily than before. And since it comes from Ai2, a US-based non-profit organization, the model is suitable for government and public sector use cases.
About the Tülu Series
The Tülu series, developed by Ai2, represents a groundbreaking progression in open-source AI models. Tülu 3 has emerged as a landmark achievement in artificial intelligence research. Beginning with early versions focused on data collection and supervised fine-tuning, the series has evolved to produce models that have challenged proprietary AI systems with innovative techniques like Direct Preference Optimization (DPO) and Proximal Policy Optimization (PPO).
Tülu 3 offers a comprehensive open-source framework with a sophisticated five-part post-training recipe that enhances core skills, including instruction following, coding, mathematical reasoning, and multilingual capabilities. By providing fully transparent data, code, and training methodologies, Tülu offers performance that rivals proprietary models from tech giants while democratizing access to advanced AI technologies. The latest model, Tülu 3 405B with 405 billion parameters, signals a significant leap forward in the field, promising to reshape AI research by offering researchers and developers a powerful, accessible tool for exploring and advancing artificial intelligence technologies.
Get Started in Minutes on SambaNova Cloud
Developers can access Tülu 3 405B on SambaNova Cloud today through the API, just like any of the other models we support. You can also use Tülu with any of our partners, such as Hugging Face, where we recently integrated SambaNova Cloud with their Inference API. Join our community as well to discuss how to use these models and more with others building with SambaNova.
About SambaNova Cloud
SambaNova Cloud is available as a service for developers to easily integrate the best open-source models with the fastest inference speeds. These speeds are powered by our state-of-the-art AI Chip, the SN40L. Whether you are building AI agents or chatbots, fast inference speeds are a MUST for your end users to enable seamless real-time experiences. SambaNova can deliver speeds that are 10X faster than GPUs. Get started in minutes with the latest and best open-source models, such as Llama 405B, on SambaNova Cloud for free today.