🛠️ Steven Gong

Search

Bidirectional Encoder Representations from Transformers (BERT)
RoBERTA

Aug 24, 2025, 1 min read

Bidirectional Encoder Representations from Transformers (BERT)

BERT is a bidirectional Transformer. BERT is not a generative model. It’s an encoder only.

Bert tries to predict the masked token.

Resources

Original paper https://arxiv.org/pdf/1810.04805
https://watml.github.io/slides/CS480680_lecture12.pdf

RoBERTA

This is just training BERT on more images.

Graph View

Backlinks

Generative Pre-Trained Transformer (GPT)
Natural Language Processing

Created with Quartz, © 2025

Blog
LinkedIn
Twitter
GitHub