Robot Generalization

Generalization is a loaded word in robot learning. What does it actually mean? There are many axes along which a policy can generalize.

Off the top of my head:

Task generalization
Embodiment generalization
Environment generalization
Object generalization

From Gemini Robotics: Bringing AI into the Physical World:

Visual Generalization: The model should be invariant to visual changes of the scene that do not affect the actions required to solve the task. These visual changes can include variations in background, lighting conditions, distractor objects or textures.
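One way to make the visual invariance idea concrete is to measure how much a policy's action changes when only the appearance of the image changes. This is a minimal sketch, not the evaluation protocol from the paper: `scale_brightness`, `invariance_gap`, and both toy policies are hypothetical names made up for illustration, and brightness scaling stands in for the broader set of visual changes (background, lighting, distractors, textures).

```python
import numpy as np

def scale_brightness(image, factor):
    """Scale pixel intensities, clipping to the valid [0, 1] range."""
    return np.clip(image * factor, 0.0, 1.0)

def invariance_gap(policy, image, perturbations):
    """Largest change in the policy's action across visual perturbations."""
    base = policy(image)
    return max(float(np.linalg.norm(policy(p(image)) - base)) for p in perturbations)

def argmax_policy(image):
    """Toy policy (assumption): the action is the (row, col) of the brightest pixel."""
    return np.array(np.unravel_index(np.argmax(image), image.shape), dtype=float)

def mean_policy(image):
    """Toy policy (assumption): the action depends directly on overall brightness."""
    return np.array([image.mean()])

# A dim 4x4 image with a single brighter "object" pixel.
image = np.full((4, 4), 0.1)
image[2, 3] = 0.5

# Lighting changes that should not affect which action solves the task.
perturbations = [lambda im, f=f: scale_brightness(im, f) for f in (0.5, 1.5)]

print(invariance_gap(argmax_policy, image, perturbations))
print(invariance_gap(mean_policy, image, perturbations))
```

The first policy only cares about where the brightest pixel is, so its gap is zero under these lighting changes; the second reads absolute brightness, so its action drifts even though the task has not changed.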

Instruction Generalization: The model should understand invariance and equivalence in natural language instructions. Going beyond fine-grained steerability studied in Section 3.3, the model should understand paraphrasing, be robust to typos, understand different languages, and varying levels of specificities.
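A rough way to probe this axis is to feed the policy several rewordings of the same instruction and check that its behavior stays the same. The sketch below is an assumption about how such a check could look, not something taken from the paper: `add_typo`, `instruction_variants`, and `consistency` are hypothetical helpers, the paraphrases are supplied by hand, and `policy` is any callable mapping an instruction string to a comparable output.

```python
import random

def add_typo(instruction, rng):
    """Swap two adjacent characters at a random position."""
    if len(instruction) < 2:
        return instruction
    i = rng.randrange(len(instruction) - 1)
    chars = list(instruction)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def instruction_variants(instruction, paraphrases, n_typos=3, seed=0):
    """Hand-written paraphrases plus a few typo-corrupted copies of the original."""
    rng = random.Random(seed)
    return list(paraphrases) + [add_typo(instruction, rng) for _ in range(n_typos)]

def consistency(policy, instruction, variants):
    """Fraction of variants for which the policy's output matches the original."""
    if not variants:
        return 1.0
    target = policy(instruction)
    return sum(policy(v) == target for v in variants) / len(variants)

instruction = "pick up the red block"
paraphrases = ["grab the red block", "lift the red cube"]
variants = instruction_variants(instruction, paraphrases)

# Toy policy (assumption): keys on exact string match, so it is brittle by design.
def exact_match_policy(text):
    return "pick_red" if text == "pick up the red block" else "unknown"

print(variants)
print(consistency(exact_match_policy, instruction, variants))
```

Different languages and varying levels of specificity would need their own hand-written or model-generated variants; the typo generator here only covers the robustness-to-typos part.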

Action Generalization: The model should be capable of adapting learned movements or synthesizing new ones, for instance to generalize to initial conditions (e.g., object placement) or object instances (e.g., shape or physical properties) not seen during training.
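To evaluate the initial-conditions part of this, it helps to separate evaluation episodes whose object placement falls inside the range seen in training from those that fall outside it. The sketch below assumes training placements were drawn from an axis-aligned box in table coordinates; `split_by_novelty` and the specific bounds are hypothetical and only illustrate the bookkeeping.

```python
import numpy as np

def split_by_novelty(placements, train_low, train_high):
    """Split placements into those inside the training box and those outside it."""
    placements = np.asarray(placements, dtype=float)
    inside = np.all((placements >= train_low) & (placements <= train_high), axis=1)
    return placements[inside], placements[~inside]

# Assumed training range for (x, y) object placement, in meters.
train_low = np.array([0.0, 0.0])
train_high = np.array([0.5, 0.5])

eval_placements = [[0.1, 0.2], [0.7, 0.3], [0.4, 0.6]]
seen, novel = split_by_novelty(eval_placements, train_low, train_high)

print(len(seen), "in-distribution placements")
print(len(novel), "novel placements")
```

Reporting success rates separately on the two groups shows whether the policy adapts its movements to new placements or only works where it has seen data.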

🛠️ Steven Gong
