Models 101

What Is a Model?

SambaNova | 2 mins

Description

Discover what an AI model actually is (billions of numbers learned through training) and why running that model, a step called inference, is the real engineering challenge.

Different Types of AI

SambaNova | 1 min

Description

Get an overview of the main types of AI systems in use today (predictive models, large language models, vision models, and decision-making systems) and learn what each one is designed to do.

Additional Resources

Blog: Generative AI terms
Blog: What is an AI chip

What Is an LLM?

SambaNova | 3 mins

Description

Find out how large language models really work: breaking text into tokens, predicting one token at a time, and using transformer attention to decide what comes next.
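As a rough illustration of "predicting one token at a time", here is a minimal sketch in Python. It is not how a real LLM is built: a whitespace split stands in for a subword tokenizer, and a table of bigram counts stands in for the transformer and its attention. The function names (`tokenize`, `train_bigram`, `generate`) and the sample corpus are hypothetical, chosen only for this example.

```python
from collections import Counter, defaultdict


def tokenize(text):
    # Toy tokenizer (assumption): real LLMs use subword tokenizers,
    # not whitespace splitting.
    return text.lower().split()


def train_bigram(corpus):
    # Count which token follows which. This table is a stand-in for
    # the billions of learned weights in a real model.
    counts = defaultdict(Counter)
    tokens = tokenize(corpus)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def generate(counts, prompt, max_new_tokens=5):
    tokens = tokenize(prompt)
    if not tokens:
        return ""
    for _ in range(max_new_tokens):
        followers = counts.get(tokens[-1])
        if not followers:
            break  # nothing ever followed this token in the corpus
        # Greedy choice: append the most frequent follower, then repeat
        # with the extended sequence.
        tokens.append(followers.most_common(1)[0][0])
    return " ".join(tokens)


counts = train_bigram("the cat sat on the mat and the cat sat down")
print(generate(counts, "the", max_new_tokens=2))  # prints "the cat sat"
```

The loop structure is the part that carries over to real models: each new token is chosen from the tokens so far, appended, and fed back in as input for the next step.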

What Is a Dataflow Graph?

SambaNova | 3 mins

Description

Explore the dataflow graph, the set of nodes and dependencies that maps every calculation an LLM runs, and learn why the way that graph executes determines how fast you get an answer.
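To make "nodes and dependencies" concrete, the sketch below models a tiny graph with Python's standard-library `graphlib` and groups nodes into stages that could, in principle, run at the same time. The node names (`embed`, `q_proj`, and so on) are a simplified, hypothetical slice of one attention layer, not an actual SambaNova graph, and `parallel_stages` is a helper invented for this example.

```python
from graphlib import TopologicalSorter

# Each key is a node; each value is the set of nodes it depends on.
# Hypothetical, simplified slice of a single attention layer.
GRAPH = {
    "embed": set(),
    "q_proj": {"embed"},
    "k_proj": {"embed"},
    "v_proj": {"embed"},
    "attention": {"q_proj", "k_proj", "v_proj"},
    "output": {"attention"},
}


def parallel_stages(graph):
    # Group nodes into stages: every node in a stage has all of its
    # dependencies satisfied by earlier stages, so the nodes within
    # one stage do not depend on each other.
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        stages.append(ready)
        sorter.done(*ready)
    return stages


for i, stage in enumerate(parallel_stages(GRAPH)):
    print(i, stage)
```

Here the three projections land in the same stage because none depends on another. How a scheduler exploits that kind of independence is one reason execution strategy affects latency.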

Related Resources

Blog: Why dataflow matters
Blog: Solving AI's infrastructure crisis with dataflow
Product: Dataflow Architecture

Prefill vs. Decode

SambaNova | 4 mins

Description

Learn how inference splits into two phases, prefill and decode, why one is compute-bound and the other memory-bound, and how the KV cache makes memory the critical bottleneck.
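A back-of-the-envelope calculation helps show why the KV cache puts pressure on memory. The sketch below uses the common sizing rule for a standard transformer: one key and one value vector per layer, per KV head, per token. The model dimensions in the example call are hypothetical and do not describe any particular model; `kv_cache_bytes` is a helper written for this illustration.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_value=2):
    # The factor of 2 covers keys and values. bytes_per_value=2
    # assumes 16-bit storage (for example fp16 or bf16).
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_value)


# Hypothetical configuration: 32 layers, 8 KV heads of dimension 128,
# a 4,096-token context, one sequence, 16-bit values.
size = kv_cache_bytes(num_layers=32, num_kv_heads=8, head_dim=128,
                      seq_len=4096)
print(size / 2**20, "MiB")  # prints "512.0 MiB"
```

The cache grows linearly with both context length and batch size, and it is read again for each generated token during decode, which is consistent with decode being limited by memory rather than by arithmetic.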

Related Resources

Blog: Why dataflow matters
Blog: Why agentic inference needs hybrid hardware
Product: RDU

Understanding Prefill & Decode for Disaggregated Inference

Varun Krishna | 4 mins

Description

Learn how large language models process inputs, what the prefill and decode stages involve, and why disaggregated inference uses different hardware for each to maximize throughput and efficiency.

Additional Resources

Blog: Why agentic inference needs hybrid hardware
Blog: Why dataflow matters
Product: RDU
