Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions

“While in principle simply replacing existing architectures (e.g., ResNets [15] or smaller convolutional neural networks [11, 14]) with a Transformer is conceptually straightforward, devising a methodology that effectively makes use of such architectures is considerably more challenging. High-capacity models only make sense when we train on large and diverse datasets – small, narrow datasets simply do not require this much capacity and do not benefit from it.”