Performance
Latency
In one line
How long the AI takes to start replying (time to first token) or to finish replying (total response time).
What does Latency mean?
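Latency is the delay between sending a request and getting the reply. It matters most for real-time chat, voice, and agents, where users feel every pause and delays add up across multiple steps. Reasoning models trade latency for accuracy: they spend extra time working through a problem before answering. Smaller models like Gemini Flash trade a bit of quality for speed.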
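As a rough illustration of the two ways latency is usually measured, the sketch below times a streamed reply: once when the first chunk arrives and once when the stream ends. The names `measure_latency` and `fake_stream` are hypothetical, and `fake_stream` only simulates a model with `time.sleep`; a real streaming client would take its place.

```python
import time
from typing import Iterable, Iterator, List, Tuple


def measure_latency(stream: Iterable[str]) -> Tuple[float, float, str]:
    """Return (seconds to first chunk, total seconds, full text) for a stream."""
    start = time.perf_counter()
    first_chunk_at = None
    parts: List[str] = []
    for chunk in stream:
        if first_chunk_at is None:
            # Time until the reply *starts* (often called time to first token).
            first_chunk_at = time.perf_counter() - start
        parts.append(chunk)
    # Time until the reply *finishes*.
    total = time.perf_counter() - start
    if first_chunk_at is None:
        # Empty stream: nothing ever arrived, so both numbers coincide.
        first_chunk_at = total
    return first_chunk_at, total, "".join(parts)


def fake_stream(chunks: List[str],
                startup_delay: float = 0.05,
                per_chunk_delay: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for a streaming model reply (assumption, not a real API)."""
    time.sleep(startup_delay)        # simulated wait before the first chunk
    for chunk in chunks:
        yield chunk
        time.sleep(per_chunk_delay)  # simulated gap between chunks


if __name__ == "__main__":
    first, total, text = measure_latency(fake_stream(["Hello", ", ", "world"]))
    print(f"first chunk after {first * 1000:.0f} ms, finished after {total * 1000:.0f} ms: {text}")
```

For chat and voice, the first number tends to matter more, because the user sees or hears the reply begin; for batch jobs, only the total matters.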
A real-world example
A voice assistant needs to start replying in under 500 ms to feel natural; a batch summariser can tolerate 30 s.

