U.S. Department of Energy Accelerates AI With SambaNova Systems
Scientific researchers are exploring ways to use artificial intelligence (AI) and machine learning (ML) in running complex scientific workloads to gain better performance and efficiency. To support this effort, the United States Department of Energy’s National Nuclear Security Administration (DOE/NNSA), Lawrence Livermore National Laboratory (LLNL), and Los Alamos National Laboratory (LANL) announced a strategic partnership with SambaNova Systems. The cornerstone of this partnership agreement is multiple installations of SambaNova Systems DataScale™.
SambaNova DataScale is a complete, integrated software and hardware systems platform optimized for dataflow from algorithms to silicon. LLNL is integrating DataScale into its Corona supercomputing system. The initial focus has been on using DataScale for National Ignition Facility applications. Corona is primarily being used for COVID-19 drug discovery, and LLNL plans to apply DataScale to this workload as well.
Improved Performance, Accuracy, and Productivity With SambaNova DataScale
SambaNova DataScale is improving overall performance, accuracy, and productivity for these demanding research institutions.
It’s no surprise, as SambaNova DataScale is designed for both efficient deep-learning inference and training calculations. It features the SambaFlow™ software stack and the world’s first Reconfigurable Dataflow Unit, the Cardinal SN10™ RDU. The system contains eight RDUs—each one capable of supporting multiple simultaneous jobs or working seamlessly together to execute large-scale models.
Ian Karlin is the principal HPC strategist at LLNL. Since SambaNova DataScale was brought on-site in September, he has reported that early tests show DataScale performing 5X or better when normalized against GPUs.
Karlin says DataScale was the right choice for LLNL for several reasons, chief among them the integrated software and hardware systems and the ability to do both training and inference on one platform.
Brian Van Essen, computer scientist and LLNL Informatics Group Leader, explains, “We selected SambaNova for this procurement because one of the key features they have is the ability to do training and inference on small batch sizes. Inference at small scales is key; training on small batches is important for retraining and fine-tuning models. That’s something we’ll be doing.” He also cites “maturity of the programming model and the team’s expertise with the software stack” as a crucial aspect of LLNL’s two-year engagement with SambaNova.
Over at LANL, the first application targeted for acceleration with DataScale is modeling quantum chemistry with density-functional theory (DFT)-level accuracy. LANL has developed a workflow for building machine learning models of interatomic energies and forces to enable molecular dynamics (MD) simulations with high accuracy in a computationally efficient manner. These ML models are very faithful to DFT reference calculations and enable reactive chemistry from first principles in support of materials science, chemistry, molecular biology, and drug design.
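To make the link between energies, forces, and molecular dynamics concrete, here is a minimal Python sketch. It is not LANL's workflow or model: a toy Lennard-Jones pair potential stands in for the learned energy function, the forces are taken as the negative gradient of that energy using finite differences, and a single velocity-Verlet step advances the atoms. All function names and parameters below are hypothetical and chosen only for illustration.

```python
import math


def pair_energy(r, epsilon=1.0, sigma=1.0):
    """Toy Lennard-Jones pair energy; a stand-in for a learned energy model."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def total_energy(positions):
    """Sum the pair energy over all distinct atom pairs."""
    energy = 0.0
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            r = math.sqrt(sum((a - b) ** 2 for a, b in zip(positions[i], positions[j])))
            energy += pair_energy(r)
    return energy


def forces(positions, h=1e-5):
    """Forces as the negative gradient of the energy (central differences)."""
    result = []
    for i in range(len(positions)):
        force_on_atom = []
        for k in range(3):
            plus = [list(p) for p in positions]
            minus = [list(p) for p in positions]
            plus[i][k] += h
            minus[i][k] -= h
            gradient = (total_energy(plus) - total_energy(minus)) / (2.0 * h)
            force_on_atom.append(-gradient)
        result.append(tuple(force_on_atom))
    return result


def velocity_verlet_step(positions, velocities, dt=1e-3, mass=1.0):
    """Advance positions and velocities by one MD time step."""
    f_old = forces(positions)
    new_positions = [
        tuple(x + v * dt + 0.5 * f / mass * dt * dt for x, v, f in zip(p, vel, frc))
        for p, vel, frc in zip(positions, velocities, f_old)
    ]
    f_new = forces(new_positions)
    new_velocities = [
        tuple(v + 0.5 * (fo + fn) / mass * dt for v, fo, fn in zip(vel, fold, fnew))
        for vel, fold, fnew in zip(velocities, f_old, f_new)
    ]
    return new_positions, new_velocities
```

Calling velocity_verlet_step repeatedly on a list of (x, y, z) positions and velocities produces a trajectory. In an ML-driven workflow, the energy function would be replaced by a trained model, and the forces would typically come from automatic differentiation, not finite differences.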
These calculations currently run on GPU hardware and are showing promise of further acceleration with the SambaNova DataScale system. An ongoing collaboration between SambaNova Systems and LANL scientists suggests a possible speedup of up to 5X over the existing GPU implementation.
Exploring Breakthrough Advances
LLNL researchers are using SambaNova DataScale to continue exploring the combination of high-performance computing (HPC) and AI, an innovative effort LLNL calls “cognitive simulation” (CogSim). Researchers said the two systems, DataScale and Corona, working in tandem will enable more streamlined computation and allow them to move applications into this new computing model.
SambaNova DataScale’s ability to run dozens of inference models at once while performing scientific calculations on the Corona system will aid in their quest to use machine learning to accelerate key applications.
According to LLNL researchers, SambaNova DataScale will be used in the small molecule drug design work being applied to COVID-19 at LLNL, as well as to cancer through the ATOM (Accelerating Therapeutics for Opportunities in Medicine) project. Recent work has produced a machine learning model for improving COVID-19 drug design that is trained on small batches, which matters because this type of model converges best at small batch sizes. SambaNova DataScale supports efficient small-batch training, a key feature that sets it apart from GPUs. This work will be integrated into drug design loops that generate new potential compounds, which are then evaluated for safety and efficacy using HPC simulations on the Corona system.
LLNL’s COVID-19 machine learning model is a finalist for the Gordon Bell Special Prize for High Performance Computing-Based COVID-19 Research, with the winner to be announced on Nov. 19.
The AI research taking place at labs such as LLNL and LANL is not unique to the public sector. Using similar techniques, forward-thinking enterprises are advancing their own AI initiatives and making significant progress.
Here at SambaNova Systems, we’re excited about the collaboration with DOE/NNSA, LLNL, and LANL. “SambaNova Systems is providing the platform for innovation to enable visionaries to achieve breakthrough advancements in their domains,” says Rodrigo Liang, our co-founder and CEO.
Our partnership with the U.S. Department of Energy is just one example of how we are enabling this.