Glossary

What is a transformer?

A transformer is the neural-network architecture behind modern language models, using self-attention to weigh how every token in the input relates to every other.

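The transformer largely displaced recurrent networks for language modeling by processing a whole sequence in parallel, rather than one token at a time, and letting each token attend to all others through self-attention. Because that computation is mostly large matrix multiplications, it maps well onto parallel hardware such as GPUs, and it underlies essentially every current large language model.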
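To make "each token attends to all others" concrete, here is a minimal sketch of single-head scaled dot-product self-attention in plain Python. It is an illustration under simplifying assumptions, not a production implementation: the function names `softmax` and `self_attention` and the toy `tokens` are hypothetical, and the queries, keys, and values are taken to be the token vectors themselves, whereas a real transformer first applies learned linear projections.

```python
import math


def softmax(xs):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption of this sketch: queries, keys, and values are the raw
    token vectors (no learned projection matrices).
    """
    d = len(tokens[0])
    outputs, all_weights = [], []
    for q in tokens:  # every token acts as a query...
        # ...and is scored against every token (including itself) as a key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in tokens
        ]
        weights = softmax(scores)
        # The output is the attention-weighted average of the value vectors.
        out = [sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights


# Three toy 2-dimensional "token" vectors.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens)
```

Each row of `weights` says how strongly one token attends to every token in the sequence. The loop over queries has no dependency between iterations, which is why real implementations can compute all rows at once as a matrix product.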

Open-source implementations of the architecture, and the model weights built on it, let teams study, run, and adapt transformers directly rather than only through a hosted API.


Trending transformer projects

NVlabs/Sana
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
★ 7.6K (+92 in 24h) · momentum 11 · Python
kyegomez/OpenMythos
A theoretical reconstruction of the Claude Mythos architecture, built from first principles using the available research literature.
★ 13.4K (+18 in 24h) · momentum 9 · Python
rohitg00/ai-engineering-from-scratch
Learn it. Build it. Ship it for others.
★ 19.2K (+331 in 24h) · momentum 7 · Python
