Generative Pre-Trained Transformer (GPT)

GPT: 110M parameters
https://watml.github.io/slides/CS480680_lecture12.pdf

Next: GPT-3
Related: BERT