Autoregressive Model (AR)
Autoregressive models predict the next token in a sequence given the previous ones:
Examples:
- PixelRNN / PixelCNN (image)
- GPT, Transformer decoders (text)
- WaveNet (audio)
And it’s a term that seems to pop around. But I think that all it does is really allow you to predict (regress) future data based on past data. So isn’t this just a simple RNN?
They also mention this in the Neural Autoregressive Flow.
CS294
Assuming a fully expressive Bayes net structure, any joint distribution can be written as a product of conditionals