Expressive Policy

This is an idea that I stumbled upon while reading a paper written by Perry Dong.

EXPO: https://arxiv.org/pdf/2507.07986

Policy Type                                      Expressiveness
Linear policy                                    Low
Shallow MLP (1 hidden layer)                     Medium
Deep MLP (many hidden layers, nonlinearities)    High
Transformer policy                               Very High
Diffusion-based policy                           Extremely High
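
To make the first two rows concrete, here is a toy sketch (my own illustration, not taken from the EXPO paper). It assumes a 1-D state s and a target action a = |s|. The names fit_linear, linear_policy, and shallow_mlp_policy are hypothetical. The best linear policy cannot bend at s = 0, while a one-hidden-layer ReLU network with two hand-picked units represents the target exactly.

```python
def relu(x):
    return max(0.0, x)


def linear_policy(s, w, b):
    # Linear policy: the action is an affine function of the state.
    return w * s + b


def shallow_mlp_policy(s):
    # One hidden layer with two ReLU units (input weights +1 and -1,
    # output weights 1 and 1). Weights are hand-picked for this toy, not learned.
    return relu(s) + relu(-s)


def fit_linear(states, actions):
    # Closed-form least-squares fit of a = w * s + b.
    n = len(states)
    mean_s = sum(states) / n
    mean_a = sum(actions) / n
    cov = sum((s - mean_s) * (a - mean_a) for s, a in zip(states, actions))
    var = sum((s - mean_s) ** 2 for s in states)
    w = cov / var
    b = mean_a - w * mean_s
    return w, b


def mse(predictions, targets):
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)


# Assumed toy problem: states on a symmetric grid in [-1, 1], target action |s|.
states = [i / 10 for i in range(-10, 11)]
targets = [abs(s) for s in states]

w, b = fit_linear(states, targets)
linear_mse = mse([linear_policy(s, w, b) for s in states], targets)
mlp_mse = mse([shallow_mlp_policy(s) for s in states], targets)

print(f"linear fit: w={w:.3f}, b={b:.3f}, mse={linear_mse:.4f}")
print(f"shallow MLP: mse={mlp_mse:.4f}")
```

The linear fit collapses to a flat line (slope near zero) because the target is symmetric around s = 0, so its error stays well above zero. The shallow network has zero error on this grid. This only illustrates the gap between the two lowest rows. It says nothing about how the higher rows rank against each other.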

🧨 Low expressiveness limits a policy to:

  • Single-mode behaviors (deterministic or unimodal)