allenai/Molmo2-8B
pip install mlforge-sdk && mlforge pull allenai/Molmo2-8B
Model details
About allenai/Molmo2-8B
Molmo2 is a family of open vision-language models developed by the Allen Institute for AI (Ai2) that support image, video and multi-image understanding and grounding. Molmo2 models are trained on publicly available third-party datasets, as referenced in our technical report, and on Molmo2 data, a collection of datasets with highly curated image-text and video-text pairs. Molmo2 achieves state-of-the-art performance among multimodal models of similar size. You can find all models in the Molmo2 family here.