Scalable Diffusion Models with Transformers (DiT): Feel like I should really read this to understand diffusion transformers.