Reinforcement Learning with Action Chunking

https://arxiv.org/pdf/2507.07969

Paper published in July 2025.

“In the adjacent field of imitation learning (IL), a widely used approach in recent years has been to employ action chunking, where instead of training policies to predict a single action based on the state observation from prior data, the policy is instead trained to predict a short sequence of future actions (an “action chunk”) [82, 11]. While a complete explanation for the effectiveness of action chunking in IL remains an open question, its effectiveness can be at least partially ascribed to better handling of non-Markovian behavior in the offline data”.

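  • Non-Markovian: the current state alone does not carry enough information to determine the action seen in the data (the action also depends on history)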
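
To make the idea concrete for myself, here is a minimal sketch of how a chunked policy is executed: query it once, run the whole chunk open-loop, then query again. This is a toy illustration, not the paper's method. All names (`rollout_with_chunks`, `toy_step`, `toy_chunk_policy`) and the 1-D environment are made up for the example. The toy demonstrator moves +1, +1, -1, -1, so at position 1 it takes a different action on the way up than on the way down, which a single-step policy that sees only the position could not reproduce.

```python
from typing import Callable, List, Sequence

State = int
Action = int


def rollout_with_chunks(
    policy: Callable[[State], Sequence[Action]],
    step: Callable[[State, Action], State],
    start: State,
    num_steps: int,
) -> List[Action]:
    """Query the policy, execute the whole chunk open-loop, then re-query."""
    state = start
    executed: List[Action] = []
    while len(executed) < num_steps:
        chunk = policy(state)  # a short sequence of future actions
        if len(chunk) == 0:
            raise ValueError("policy returned an empty action chunk")
        for action in chunk:
            if len(executed) == num_steps:
                break
            state = step(state, action)
            executed.append(action)
    return executed


# Hypothetical toy environment: the state is a position on a line.
def toy_step(state: State, action: Action) -> State:
    return state + action


# Hypothetical chunked policy: always emits the same 4-step wiggle.
def toy_chunk_policy(state: State) -> List[Action]:
    return [1, 1, -1, -1]


actions = rollout_with_chunks(toy_chunk_policy, toy_step, start=0, num_steps=8)
print(actions)
```

The policy is only consulted at positions where the chunk starts, so the ambiguous state (position 1) never has to be resolved by a single-step decision.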

RL is built on the Markov assumption (the next state and reward depend only on the current state and action), which seems like quite a limitation.