Focuses on cross-entropy loss scaling laws. I’m curious to better understand the scaling laws for RL. One relevant paper is “Horizon Reduction Makes RL Scalable.”