Core Concepts

Inference

In one line

The act of running a trained model to get an answer — as opposed to training it.

What does Inference mean?

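Inference is what happens every time you send a prompt: the model runs a forward pass through its already-trained, fixed weights to produce output. Because this work is repeated for every request, inference often accounts for most of the ongoing cost of a deployed AI system, whereas training is paid for up front.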
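As a rough illustration of the idea above, here is a minimal Python sketch of inference on a toy next-token model. Everything in it is hypothetical: the four-token vocabulary, the made-up `WEIGHTS` table, and the `forward` and `generate` helpers are assumptions for illustration, not how any real LLM is implemented. The point it tries to show is that inference only reads the weights and never changes them.

```python
import math

# Hypothetical toy vocabulary and "trained" weights (made-up values).
# At inference time these weights are only read, never updated.
VOCAB = ["<end>", "hello", "world", "!"]
# WEIGHTS[i][j] = score for token j following token i.
WEIGHTS = [
    [0.0, 0.0, 0.0, 0.0],  # after "<end>"
    [0.1, 0.0, 2.0, 0.3],  # after "hello": "world" scores highest
    [0.2, 0.0, 0.0, 1.5],  # after "world": "!" scores highest
    [3.0, 0.0, 0.0, 0.0],  # after "!": "<end>" scores highest
]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def forward(token_id):
    """One forward pass: read the fixed weights, return next-token probabilities."""
    return softmax(WEIGHTS[token_id])


def generate(prompt_token, max_new_tokens=10):
    """Greedy generation: one forward pass per step until "<end>" is picked."""
    tokens = [prompt_token]
    forward_passes = 0
    current = VOCAB.index(prompt_token)
    for _ in range(max_new_tokens):
        probs = forward(current)
        forward_passes += 1
        # Pick the most likely next token (greedy decoding).
        current = max(range(len(probs)), key=probs.__getitem__)
        if VOCAB[current] == "<end>":
            break
        tokens.append(VOCAB[current])
    return tokens, forward_passes


tokens, passes = generate("hello")
print(tokens, passes)
```

In this sketch a single reply takes several forward passes, one per generation step, which hints at why the cost of inference grows with the length of the output.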

A real-world example

Each ChatGPT reply is produced by inference: the model generates the reply one token at a time, running a forward pass for each token.

Related terms

Large Language Model (LLM)

A neural network trained on huge text collections to predict the next token (roughly, the next word or piece of a word). It is the engine behind ChatGPT, Claude and Gemini.

Training

The expensive process of teaching a model by adjusting its weights on huge amounts of data.

Latency

How long the AI takes to reply, measured either until the first token appears or until the full reply is finished.
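The two readings of latency (time until the reply starts, and time until it finishes) can be measured separately. The sketch below assumes a streaming reply and uses a hypothetical `fake_stream` generator as a stand-in for a real model; the token count and the delay are made-up values chosen only for illustration.

```python
import time


def fake_stream(n_tokens=5, delay_s=0.01):
    """Hypothetical stand-in for a streaming model reply."""
    for i in range(n_tokens):
        time.sleep(delay_s)  # pretend each token takes some time to compute
        yield f"tok{i}"


def measure_latency(stream):
    """Return (time to first token, total time, token count) in seconds."""
    start = time.perf_counter()
    first = None
    count = 0
    for _ in stream:
        if first is None:
            # Latency until the reply starts.
            first = time.perf_counter() - start
        count += 1
    # Latency until the reply finishes.
    total = time.perf_counter() - start
    return first, total, count


ttft, total, count = measure_latency(fake_stream())
print(f"first token after {ttft:.3f}s, full reply after {total:.3f}s, {count} tokens")
```

Time to first token tends to matter most for how responsive a chat feels, while total time matters more when the full answer is needed before anything else can happen.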