Task
Robotics
robotics · transformers model
OpenVLA 7B (openvla-7b) is an open vision-language-action model trained on 970K robot manipulation episodes from the Open X-Embodiment dataset. The model takes language instructions and camera images as input and generates robot actions. It supports controlling multiple robots out-of-the-box, and can be quickly adapted for new robot domains via (parameter-efficient) fine-tuning.