Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes

They refer to the C51 paper for the distributional RL part of the method.
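For reference, C51 represents the return as a categorical distribution over a fixed grid of atoms and trains with a cross-entropy loss against a projected Bellman target. Below is a minimal sketch of that projection step in plain Python. It is not taken from the paper's code: the function names `c51_project` and `cross_entropy` are hypothetical, and the default support range is only an illustrative assumption.

```python
import math


def c51_project(next_probs, reward, gamma, done, v_min=-10.0, v_max=10.0):
    """Project the distribution of r + gamma * Z back onto the fixed atoms.

    next_probs: probabilities over the atoms for the next state-action pair.
    v_min, v_max: assumed support range (illustrative defaults).
    """
    n_atoms = len(next_probs)
    delta_z = (v_max - v_min) / (n_atoms - 1)
    target = [0.0] * n_atoms
    for j, p in enumerate(next_probs):
        z_j = v_min + j * delta_z
        # Bellman update of atom j; terminal transitions keep only the reward.
        tz = reward if done else reward + gamma * z_j
        tz = min(max(tz, v_min), v_max)
        # Fractional index of the shifted atom, clamped against rounding error.
        b = min(max((tz - v_min) / delta_z, 0.0), float(n_atoms - 1))
        lower, upper = math.floor(b), math.ceil(b)
        if lower == upper:
            # Shifted atom lands exactly on a grid point: keep all its mass there.
            target[lower] += p
        else:
            # Split the mass between the two neighbouring atoms.
            target[lower] += p * (upper - b)
            target[upper] += p * (b - lower)
    return target


def cross_entropy(target, predicted, eps=1e-8):
    """Cross-entropy between the projected target and the predicted distribution."""
    return -sum(t * math.log(q + eps) for t, q in zip(target, predicted))
```

With three atoms on the range 0 to 2, a next-state distribution concentrated on the middle atom, reward 0.5, and gamma 1.0, the shifted atom lands at 1.5, so its mass is split evenly between the middle and top atoms.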