Attention Is All You Need
First annotated walkthrough (in progress), starting with the abstract, the introduction, BLEU, recurrence, and why attention changes the computation.
Reading
Transformers
Attention
Machine translation
8 notes