Core Concepts

Training

In one line

The expensive process of teaching a model by adjusting its weights on huge amounts of data.

What does Training mean?

Training uses gradient descent over trillions of tokens to shape a model's parameters. Pretraining creates a foundation model; fine-tuning adapts one to a task.

A real-world example

Meta reportedly spent ~$500m of GPU time training Llama 3.

Related terms