RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control https://arxiv.org/pdf/2307.15818