Vision-Language-Action Model (VLA)

The VLA concept was first introduced with RT-2: https://robotics-transformer2.github.io/

Models: OpenVLA