World Model

A world model is a learned (or designed) internal representation of the environment in which an agent (like a robot) operates.

The original world model idea:

Some papers:

Links sent by Jason:

Faraz also told me some

Let’s reason from first principles. How do we build a world model?

It’s essentially a model of the world that you can interact with.

So you are essentially learning the dynamics: given the current state and an action, predict the next state.
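To make "learning the dynamics" concrete, here is a minimal sketch under strong assumptions: a toy 2-D state (position, velocity), a 1-D action, and linear dynamics that the learner recovers with least squares. The names `A_true`, `B_true`, `step`, and `predict` are made up for illustration and do not come from any paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ground-truth dynamics (unknown to the learner): s' = A s + B a.
A_true = np.array([[1.0, 0.1],
                   [0.0, 1.0]])
B_true = np.array([[0.0],
                   [0.1]])

def step(state, action):
    """The 'real world': returns the next state."""
    return A_true @ state + B_true @ action

# Collect (state, action, next_state) transitions by acting randomly.
states, actions, next_states = [], [], []
s = np.zeros(2)
for _ in range(200):
    a = rng.uniform(-1.0, 1.0, size=1)
    s_next = step(s, a)
    states.append(s)
    actions.append(a)
    next_states.append(s_next)
    s = s_next

X = np.hstack([np.array(states), np.array(actions)])  # shape (200, 3)
Y = np.array(next_states)                             # shape (200, 2)

# Fit the learned dynamics: Y ~= X @ W, solved by least squares.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)             # shape (3, 2)

def predict(state, action):
    """The learned world model: predicts the next state."""
    return np.concatenate([state, action]) @ W
```

Once fitted, `predict` can stand in for `step`, which is the sense in which you can "interact" with the model instead of the real environment. Real environments are rarely linear, so in practice the least-squares fit would be replaced by a neural network.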

That’s pretty straightforward, but what about images? Then the model has to learn the physics and also how the light (the pixels you actually observe) changes as a result.

In the original paper, they just use a VAE to convert each image into a latent vector.
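A rough sketch of what the VAE encoding step looks like, assuming a single linear layer with random, untrained weights in place of a real encoder network. The image size, latent size, and the names `encode` and `sample_latent` are illustrative assumptions; only the overall shape of the computation (image in, mean and log-variance out, sample a latent) is the point.

```python
import numpy as np

rng = np.random.default_rng(0)

IMAGE_SHAPE = (64, 64, 3)  # assumed image size, for illustration only
LATENT_DIM = 32            # assumed latent size, for illustration only

n_pixels = int(np.prod(IMAGE_SHAPE))

# Untrained stand-in weights; a real VAE learns these (usually with conv layers).
W_mu = rng.normal(0.0, 0.01, size=(n_pixels, LATENT_DIM))
W_logvar = rng.normal(0.0, 0.01, size=(n_pixels, LATENT_DIM))

def encode(image):
    """Map an image to the mean and log-variance of a Gaussian over latents."""
    x = image.reshape(-1)
    return x @ W_mu, x @ W_logvar

def sample_latent(mu, logvar):
    """Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

image = rng.uniform(0.0, 1.0, size=IMAGE_SHAPE)  # fake frame
mu, logvar = encode(image)
z = sample_latent(mu, logvar)  # compact latent vector for this frame
```

The rest of the model then works on `z` instead of raw pixels, which is much smaller than the image itself.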

Then, they use a mixture of Gaussians.
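For reference, a small sketch of what a mixture of Gaussians is, in one dimension. The notes do not spell out how it is used, so this only shows the distribution itself: computing a log-density and drawing a sample. The function names and the example parameters are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def mixture_log_prob(x, weights, means, stds):
    """log p(x) where p(x) = sum_k weights[k] * Normal(x; means[k], stds[k])."""
    log_comp = (np.log(weights) - np.log(stds) - 0.5 * np.log(2.0 * np.pi)
                - 0.5 * ((x - means) / stds) ** 2)
    m = log_comp.max()  # log-sum-exp for numerical stability
    return m + np.log(np.exp(log_comp - m).sum())

def mixture_sample(weights, means, stds):
    """Pick a component by its weight, then sample from that Gaussian."""
    k = rng.choice(len(weights), p=weights)
    return rng.normal(means[k], stds[k])

# Example parameters (arbitrary): two modes, one more likely than the other.
weights = np.array([0.7, 0.3])
means = np.array([-1.0, 2.0])
stds = np.array([0.5, 0.5])

x = mixture_sample(weights, means, stds)
log_p = mixture_log_prob(x, weights, means, stds)
```

The appeal over a single Gaussian is that the prediction can be multimodal: if the future could go one of several ways, each way gets its own component instead of being averaged into one blurry guess.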

First, they have a vision encoder. Then, they have this RNN that predicts