Architecture

Transformer

Also known as: transformers

In one line

The neural network architecture behind almost every modern LLM.

What does Transformer mean?

Introduced by Google in 2017 ("Attention Is All You Need"), the transformer uses self-attention to process sequences in parallel. It's the foundation of GPT, Claude, Gemini, and Llama.

A real-world example

The "T" in GPT stands for Transformer.

Related terms