Intro to Transformer AI Models — The 18 Building Blocks of the Transformer Model in Action: A Language Translation Example

In my previous article, we looked at the 5 most important characteristics of how a Transformer model works. These fundamentals are important for advancing our knowledge.
Now, we take it a step further. Under the hood of the Transformer model, we find different building blocks that bring a Transformer model like ChatGPT into action.
We will go through them in the same order in which a Transformer model uses them to carry out its tasks, such as language translation.
For a quick glimpse at them, I’ll provide a TL;DR, but I recommend reading the deep dive further below.
TL;DR
To generate accurate text, for instance a translation from English to German, a Transformer model needs the steps shown in the architecture diagram from the “Attention Is All You Need” paper (further below).
The model performs several transformations, from text to numbers and then to many more numbers, and uses several calculations to learn about the relationships between words in the input sentence and the generated output.
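To make the "text to numbers to many more numbers" idea concrete, here is a minimal sketch in plain Python. It is an illustration under simplifying assumptions, not the architecture from the paper: the three-word sentence, the word-level vocabulary, the random embedding values, and the helper names `softmax` and `attention_weights` are all hypothetical, and the raw embeddings stand in for the learned query and key projections a real Transformer would use.

```python
import math
import random

# Hypothetical toy sentence and word-level vocabulary.
# Real models use learned subword tokenizers instead.
sentence = "the cat sat"
words = sentence.split()
vocab = {word: idx for idx, word in enumerate(sorted(set(words)))}

# Step 1: text -> numbers (one integer ID per token).
token_ids = [vocab[word] for word in words]

# Step 2: numbers -> many more numbers (one vector per token).
# Random values stand in for embeddings a real model would learn.
random.seed(0)
d_model = 4
embedding_table = [
    [random.uniform(-1, 1) for _ in range(d_model)] for _ in vocab
]
embeddings = [embedding_table[i] for i in token_ids]


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(vectors):
    """Scaled dot-product weights between every pair of token vectors.

    Simplification: the vectors are used directly as queries and keys.
    """
    scale = math.sqrt(len(vectors[0]))
    return [
        softmax([sum(q * k for q, k in zip(query, key)) / scale
                 for key in vectors])
        for query in vectors
    ]


# Step 3: calculations that relate each word to every other word.
weights = attention_weights(embeddings)
```

Each row of `weights` describes how strongly one word relates to every word in the sentence, including itself, and each row sums to 1.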
I’ve summarized the steps that the Transformer is carrying out for a language…