Robot Generalization
Generalization is a loaded word in robot learning. What does it actually mean? There are many axes along which a policy can be asked to generalize.
Off the top of my head:
- Task generalization
- Embodiment generalization
- Environment generalization
- Object generalization
From "Gemini Robotics: Bringing AI into the Physical World":
Visual Generalization: The model should be invariant to visual changes of the scene that do not affect the actions required to solve the task. These visual changes can include variations in background, lighting conditions, distractor objects or textures.
Instruction Generalization: The model should understand invariance and equivalence in natural language instructions. Going beyond fine-grained steerability studied in Section 3.3 [of the paper], the model should understand paraphrasing, be robust to typos, understand different languages, and varying levels of specificities.
Action Generalization: The model should be capable of adapting learned movements or synthesizing new ones, for instance to generalize to initial conditions (e.g., object placement) or object instances (e.g., shape or physical properties) not seen during training.
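One way to keep these axes separate in practice is to tag each evaluation episode with the axis that was perturbed and report a success rate per axis instead of a single aggregate number. Below is a minimal sketch of that bookkeeping. The `Episode` record, the `success_rate_by_axis` helper, and the outcomes are all hypothetical placeholders for illustration; none of them come from the Gemini Robotics paper.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Episode:
    """One evaluation rollout (hypothetical record format)."""
    axis: str      # perturbed axis: "visual", "instruction", or "action"
    success: bool  # whether the task was completed


def success_rate_by_axis(episodes):
    """Return {axis: fraction of successful episodes} for each axis present."""
    counts = defaultdict(lambda: [0, 0])  # axis -> [successes, total]
    for ep in episodes:
        counts[ep.axis][0] += int(ep.success)
        counts[ep.axis][1] += 1
    return {axis: s / n for axis, (s, n) in counts.items()}


# Toy placeholder outcomes, NOT real results.
episodes = [
    Episode("visual", True),
    Episode("visual", True),
    Episode("instruction", True),
    Episode("instruction", False),
    Episode("action", False),
    Episode("action", False),
]
rates = success_rate_by_axis(episodes)
```

Here `rates` maps each axis to its own success fraction, so a policy that copes with lighting changes but fails on unseen object placements shows up as such rather than being averaged into one score.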