Core Concepts
Inference
In one line
The act of running a trained model to get an answer — as opposed to training it.
What does Inference mean?
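Inference is what happens every time you send a prompt: the model runs forward-pass calculations through its already-trained, fixed weights to produce output. For a language model, that means one forward pass per generated token. Because inference runs on every request while training happens up front, inference cost often dominates the economics of deploying AI.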
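To make the forward-pass idea concrete, here is a minimal sketch in plain Python. The weights, biases, and the `infer` function are hypothetical stand-ins for a trained model, not any real system's parameters or API. The point it illustrates is that inference only reads the weights and never updates them.

```python
import math

# Hypothetical "trained" parameters (assumed for illustration only).
# During inference they are read, never modified.
WEIGHTS = [[0.5, -1.0], [1.5, 0.25], [-0.75, 2.0]]  # 3 classes x 2 features
BIASES = [0.0, 0.1, -0.2]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def infer(features):
    """One forward pass: weighted sums, then softmax, then pick the best class."""
    scores = [
        sum(w * x for w, x in zip(row, features)) + b
        for row, b in zip(WEIGHTS, BIASES)
    ]
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs


label, probs = infer([1.0, 2.0])
print("predicted class:", label)
```

Training would add a step that compares the output to a known answer and adjusts `WEIGHTS` and `BIASES`. Inference skips that step entirely, which is why a single run is far cheaper than training, even though the total adds up across many requests.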
A real-world example
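Each ChatGPT reply is one inference run: your prompt goes in, the model computes, and the reply comes out, with no change to the model's weights.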
Related terms
Large Language Model (LLM)
A neural network trained on huge text collections to predict the next word — the engine behind ChatGPT, Claude and Gemini.
Training
The expensive process of teaching a model by adjusting its weights on huge amounts of data.
Latency
How long the AI takes to start (or finish) replying.
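The two readings of latency above (time to start replying versus time to finish) can be measured separately. The sketch below assumes a streamed reply and uses a hypothetical `fake_stream` generator in place of a real model, with a made-up per-token delay.

```python
import time


def fake_stream(tokens, delay_s=0.01):
    """Hypothetical stand-in for a model that streams its reply token by token."""
    for tok in tokens:
        time.sleep(delay_s)  # simulated compute time per token (assumed value)
        yield tok


def measure_latency(stream):
    """Return (time to first token, total time, collected tokens)."""
    start = time.perf_counter()
    first_token_s = None
    received = []
    for tok in stream:
        if first_token_s is None:
            first_token_s = time.perf_counter() - start
        received.append(tok)
    total_s = time.perf_counter() - start
    return first_token_s, total_s, received


ttft, total, reply = measure_latency(fake_stream(["Hello", ",", " world"]))
print(f"time to first token: {ttft:.3f}s, total: {total:.3f}s")
```

Time to first token is what makes a chat interface feel responsive, while total time matters more when the whole answer is needed before anything else can happen.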

