Reinforcement Learning from Human Feedback (RLHF)
Resources
This is where AI alignment matters: the human feedback is what determines which responses count as good or bad, so the preferences you collect end up shaping the model's behavior.
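To make the "human feedback" part concrete, here's a minimal sketch (my own illustration, not from the slides) of the pairwise Bradley-Terry loss commonly used to train an RLHF reward model: the model scores a human-preferred and a rejected response, and the loss pushes the preferred score higher. `TinyRewardModel`, `preference_loss`, and the random embeddings standing in for encoded responses are all placeholders I made up for the example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyRewardModel(nn.Module):
    """Toy stand-in for a reward model: maps a pooled text
    embedding to a single scalar reward score."""
    def __init__(self, dim: int = 16):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, embedding: torch.Tensor) -> torch.Tensor:
        return self.score(embedding).squeeze(-1)

def preference_loss(model, chosen_emb, rejected_emb):
    # Bradley-Terry pairwise loss: -log sigmoid(r_chosen - r_rejected).
    # Minimizing it raises the score of the human-preferred response
    # relative to the rejected one.
    r_chosen = model(chosen_emb)
    r_rejected = model(rejected_emb)
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Usage with random embeddings standing in for encoded responses.
model = TinyRewardModel()
chosen = torch.randn(4, 16)    # batch of preferred-response embeddings
rejected = torch.randn(4, 16)  # batch of rejected-response embeddings
loss = preference_loss(model, chosen, rejected)
loss.backward()
print(loss.item())
```

The trained reward model is then used as the reward signal when fine-tuning the policy model with RL (e.g. PPO).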
OK, I don't want to keep copy-pasting these slides, but they do a very good job of explaining how this works.