Scaling Instructable Agents Across Many Simulated Worlds (SIMA)
Blog:
This is not a generative model, it's just an agent. There's no RL involved, though. Very similar to the LLM paradigm: supervised learning on recorded data, so training is always off-policy.

It gets its ground truth from demonstrations of how to play a game, so the quality of the data seems quite important here?
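
A minimal sketch of what "no RL, always off-policy" could look like, assuming a plain behavioral-cloning setup. This is my own toy illustration, not SIMA's architecture or code: the linear softmax policy and the names `train_behavioral_cloning` and `act` are hypothetical. The policy only ever sees fixed (observation, action) pairs, never a reward, which is also why bad demonstrations get copied as faithfully as good ones.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_for(weights, obs):
    """Linear policy: one weight row per action."""
    return [sum(w * x for w, x in zip(row, obs)) for row in weights]


def train_behavioral_cloning(demos, n_features, n_actions,
                             epochs=200, lr=0.5, seed=0):
    """Fit the policy to a fixed dataset of (observation, action) pairs.

    No environment is stepped and no reward is used. The only training
    signal is the demonstrated action (cross-entropy loss), much like
    next-token prediction in an LLM.
    """
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.01, 0.01) for _ in range(n_features)]
               for _ in range(n_actions)]
    for _ in range(epochs):
        for obs, action in demos:
            probs = softmax(logits_for(weights, obs))
            for a in range(n_actions):
                # Gradient of cross-entropy w.r.t. logit a: p_a - 1[a == action]
                grad = probs[a] - (1.0 if a == action else 0.0)
                for j in range(n_features):
                    weights[a][j] -= lr * grad * obs[j]
    return weights


def act(weights, obs):
    """Greedy action: index of the largest logit."""
    logits = logits_for(weights, obs)
    return max(range(len(logits)), key=logits.__getitem__)


# Toy demonstrations (made up): observation [1, 0] -> action 0, [0, 1] -> action 1.
clean_demos = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
clean_policy = train_behavioral_cloning(clean_demos, n_features=2, n_actions=2)

# Same observations with the demonstrated actions swapped, standing in for bad data.
flipped_demos = [([1.0, 0.0], 1), ([0.0, 1.0], 0)]
flipped_policy = train_behavioral_cloning(flipped_demos, n_features=2, n_actions=2)
```

The policy trained on `flipped_demos` reproduces the swapped actions, since nothing in the loss can tell a good demonstration from a bad one. That matches the intuition that data quality matters a lot in this setup.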