Scaling Laws
How do different scaling laws apply in different contexts? I’m going to spend a lot of time studying scaling laws and how they apply to reinforcement learning.
There’s Moore’s Law: the observation that the number of transistors on a chip doubles roughly every two years.
Resources:
- https://cameronrwolfe.substack.com/p/llm-scaling-laws
- https://medium.com/sage-ai/demystify-transformers-a-comprehensive-guide-to-scaling-laws-attention-mechanism-fine-tuning-fffb62fc2552
Papers:
- Neural Scaling Law: https://en.wikipedia.org/wiki/Neural_scaling_law
For RL, see Scaling RL.