Offline Q-Learning on Diverse Multi-Task Data Both Scales And Generalizes

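The authors cite the C51 paper (Bellemare et al., 2017, "A Distributional Perspective on Reinforcement Learning") for distributional RL. C51 models the return as a categorical distribution over a fixed set of 51 atoms instead of a single expected Q-value.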
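
As a reminder of what the C51 update involves, here is a minimal sketch of its categorical projection step for a single transition. It follows the C51 paper in outline only and is not the implementation used in this paper; the names `c51_project` and `cross_entropy`, their parameters, and the toy 3-atom support in the usage note are all illustrative assumptions.

```python
import math


def c51_project(next_probs, reward, gamma, v_min, v_max, done=False):
    """Project the Bellman-shifted return distribution onto the fixed atoms.

    next_probs: probabilities over the atoms for the next state-action pair.
    Returns the target probabilities over the same atoms.
    (Hypothetical helper; names and signature are assumptions.)
    """
    n = len(next_probs)
    dz = (v_max - v_min) / (n - 1)  # spacing between adjacent atoms
    target = [0.0] * n
    for j, p in enumerate(next_probs):
        z = v_min + j * dz  # value of atom j
        # Bellman backup of the atom, clipped to the supported range.
        tz = reward if done else reward + gamma * z
        tz = min(max(tz, v_min), v_max)
        # Fractional index of the shifted atom; clamp against float round-off.
        b = min(max((tz - v_min) / dz, 0.0), float(n - 1))
        lo, hi = math.floor(b), math.ceil(b)
        if lo == hi:
            # Shifted atom lands exactly on an existing atom.
            target[lo] += p
        else:
            # Split the mass between the two neighbouring atoms.
            target[lo] += p * (hi - b)
            target[hi] += p * (b - lo)
    return target


def cross_entropy(target, pred, eps=1e-12):
    """Cross-entropy between projected target and predicted distribution."""
    return -sum(t * math.log(q + eps) for t, q in zip(target, pred))
```

With a toy support of three atoms at -1, 0 and 1, calling `c51_project([0.0, 0.0, 1.0], reward=0.0, gamma=0.5, v_min=-1.0, v_max=1.0)` shifts the top atom to 0.5 and splits its mass evenly between the atoms at 0 and 1. The network is then trained by minimizing `cross_entropy` between this target and its predicted distribution for the current state-action pair.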