🛠️ Steven Gong



Mar 17, 2025, 1 min read

RT-1-X

My question: how is it trained on the datasets that have no instruction annotations?

In that case, the instruction is just left empty.

“It takes in a history of 15 images along with the natural language“.
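To make the input format concrete, here is a minimal sketch of a buffer that holds the last 15 frames plus one instruction and falls back to an empty string when a dataset has no annotation. The class name `PolicyInputBuffer`, the dictionary keys, and the padding rule (repeating the oldest frame until 15 frames exist) are all assumptions made for illustration, not the actual RT-1-X data pipeline.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Deque, Optional

HISTORY_LEN = 15  # number of frames in the history, per the quote above


@dataclass
class PolicyInputBuffer:
    """Hypothetical helper: rolling window of frames plus one instruction."""

    instruction: Optional[str] = None
    frames: Deque[Any] = field(
        default_factory=lambda: deque(maxlen=HISTORY_LEN)
    )

    def add_frame(self, frame: Any) -> None:
        # deque(maxlen=...) silently drops the oldest frame once full.
        self.frames.append(frame)

    def model_input(self) -> dict:
        if not self.frames:
            raise ValueError("need at least one frame")
        # Assumption: early in an episode, pad by repeating the oldest frame.
        pad = [self.frames[0]] * (HISTORY_LEN - len(self.frames))
        return {
            "images": pad + list(self.frames),
            # Missing annotation -> empty instruction string.
            "instruction": self.instruction or "",
        }
```

With this sketch, an episode from an unannotated dataset yields the same input structure as an annotated one, just with `""` as the instruction, so a single model can consume both.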

This is essentially a VLA (vision-language-action) model.

Related

Open-X Embodiment
RT-H

