Reinforcement Learning from Human Feedback (RLHF)

Resources

This is where AI alignment becomes important, since the human feedback determines what counts as a good or a bad response.
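To make that concrete: in practice the feedback is usually collected as pairwise preferences (a human picks the better of two responses), and a reward model is trained so preferred responses score higher. A minimal sketch of that idea in PyTorch, using the Bradley-Terry style loss; the tensor values are made up for illustration:

```python
# Sketch (assumed setup, not from these notes): turning pairwise human
# preferences into a training signal for a reward model in RLHF.
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor,
                    reward_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry loss: push the reward of the human-preferred
    response above the reward of the rejected one."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy scalar rewards the reward model assigned to two responses per
# prompt, where the human preferred the first response each time.
r_chosen = torch.tensor([1.2, 0.3])
r_rejected = torch.tensor([0.4, 0.9])
loss = preference_loss(r_chosen, r_rejected)
print(loss)  # lower when chosen responses consistently outscore rejected ones
```

The trained reward model then stands in for the human: during the RL stage, the policy is optimized to produce responses that this model scores highly.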

OK, I don't want to keep copy-pasting these slides, but they are very good for explaining how things work.