A Reduction of Imitation Learning and Structured Prediction to No-Regret Online Learning

I first heard about this paper while reading up on ALOHA.

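DAgger (Dataset Aggregation) improves on behavioral cloning by training on the states the learned policy itself visits, each labeled with the expert's action, so the training data better resembles the observations the policy is likely to encounter when it runs. The cost is that the expert must be queried online during training.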
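
To make the loop concrete, here is a minimal sketch of the idea in Python, not the paper's reference implementation. Everything in it is an assumption chosen for illustration: the toy 1-D dynamics, the hypothetical `expert_action` controller, the least-squares learner in `fit_linear`, and the choice to roll out the expert only on the first iteration and the learner afterwards. It also returns the last policy rather than picking the best one on held-out data.

```python
import numpy as np

def expert_action(state):
    # Hypothetical expert: a proportional controller that pushes the state toward 0.
    return -0.5 * state

def rollout(policy, start, horizon, rng):
    # Run `policy` in a toy 1-D system x' = x + a + noise and return the visited states.
    states = []
    x = start
    for _ in range(horizon):
        states.append(x)
        x = x + policy(x) + 0.01 * rng.standard_normal()
    return np.array(states)

def fit_linear(states, actions):
    # Supervised learner (assumed): least-squares fit of a = w * x + b.
    X = np.column_stack([states, np.ones_like(states)])
    w, b = np.linalg.lstsq(X, actions, rcond=None)[0]
    return lambda x: w * x + b

def dagger(n_iters=5, horizon=20, seed=0):
    rng = np.random.default_rng(seed)
    data_states = np.empty(0)
    data_actions = np.empty(0)
    policy = expert_action  # first iteration: collect states by rolling out the expert
    for _ in range(n_iters):
        # 1. Visit states under the current policy, not the expert's distribution.
        visited = rollout(policy, start=rng.uniform(-1, 1), horizon=horizon, rng=rng)
        # 2. Query the expert online for the correct action in each visited state.
        labels = expert_action(visited)
        # 3. Aggregate into one growing dataset and retrain on all of it.
        data_states = np.concatenate([data_states, visited])
        data_actions = np.concatenate([data_actions, labels])
        policy = fit_linear(data_states, data_actions)
    return policy, data_states.size

policy, n_samples = dagger()
```

Step 2 is where the online-expert requirement shows up: the labels are needed for states that only exist once the learner has been rolled out, so a fixed set of demonstrations recorded in advance is not enough.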