OpenVLA-OFT
Paper that introduces an optimized fine-tuning (OFT) recipe for OpenVLA.
Links
- https://openvla-oft.github.io/
- https://arxiv.org/html/2502.19645v1
- https://github.com/moojink/openvla-oft
Architecture
Two main contributions:
- Add parallel decoding
- FiLM (feature-wise linear modulation) for better adherence to language instructions
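A rough sketch of the FiLM idea, to make the second bullet concrete: a conditioning vector (here a toy stand-in for a language embedding) is projected to a per-feature scale gamma and shift beta, which are then applied to every visual token. This is the generic FiLM formulation in plain Python. The function names, shapes, and weights are made up for illustration and are not taken from the OpenVLA-OFT code.

```python
def linear(weights, bias, vec):
    """Compute W @ vec + bias using plain Python lists."""
    return [sum(w * v for w, v in zip(row, vec)) + b
            for row, b in zip(weights, bias)]


def film(features, cond, w_gamma, b_gamma, w_beta, b_beta):
    """Generic FiLM: scale and shift each feature using values predicted from cond.

    features: list of token vectors (e.g. visual patch features)
    cond:     conditioning vector (e.g. a pooled language embedding)
    All weights are hypothetical toy parameters.
    """
    gamma = linear(w_gamma, b_gamma, cond)  # per-feature scale
    beta = linear(w_beta, b_beta, cond)     # per-feature shift
    return [[g * x + b for g, x, b in zip(gamma, token, beta)]
            for token in features]


# Toy example: two visual tokens with two features each.
features = [[1.0, 2.0], [3.0, 4.0]]
cond = [1.0, 0.0]  # toy "instruction" embedding
w_gamma, b_gamma = [[2.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
w_beta, b_beta = [[0.5, 0.0], [1.0, 0.0]], [0.0, 0.0]

modulated = film(features, cond, w_gamma, b_gamma, w_beta, b_beta)
print(modulated)  # [[2.5, 1.0], [6.5, 1.0]]
```

In this toy case the instruction zeroes out the second feature's scale, which shows how the conditioning can change what the visual features emphasize.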
Parallel decoding looks like an interesting way to increase inference speed: all action tokens are predicted in a single forward pass instead of one token at a time.
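To see why this helps, here is a toy comparison that only counts forward passes. ToyPolicy is a hypothetical stand-in, not the OpenVLA API, and the sizes (7 action dimensions, a chunk of 8 timesteps) are illustrative assumptions. A real model would not generally produce identical outputs under both schemes. The point is only the number of passes.

```python
class ToyPolicy:
    """Hypothetical stand-in for a VLA action decoder that counts forward passes."""

    def __init__(self):
        self.forward_passes = 0

    def forward(self, prefix, num_outputs):
        """Return num_outputs dummy action values from one forward pass."""
        self.forward_passes += 1
        start = len(prefix)
        return [0.1 * (start + i) for i in range(num_outputs)]


def decode_autoregressive(policy, num_actions):
    """One forward pass per action value, each conditioned on the previous ones."""
    actions = []
    for _ in range(num_actions):
        actions.extend(policy.forward(actions, 1))
    return actions


def decode_parallel(policy, num_actions):
    """All action values requested from a single forward pass."""
    return policy.forward([], num_actions)


ACTION_DIM = 7   # assumed action dimensionality
CHUNK_LEN = 8    # assumed number of future timesteps per chunk
num_values = ACTION_DIM * CHUNK_LEN

ar_policy = ToyPolicy()
decode_autoregressive(ar_policy, num_values)

par_policy = ToyPolicy()
decode_parallel(par_policy, num_values)

print(ar_policy.forward_passes)   # 56
print(par_policy.forward_passes)  # 1
```

The autoregressive cost grows with both the action dimension and the chunk length, while the parallel variant stays at one pass.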
Includes some very good visualizations of the different tasks and how they fail.