Building GPT From Scratch

Stumbled upon an interesting video by Andrej Karpathy about building a Generatively Pretrained Transformer (GPT). I definitely need to take some time to look through the papers linked in the video and understand how everything works.