deepseek-ai/deepseek-vl2-tiny

Image Text To Text·deepseek-ai· 910.7K· 248

transformers other 3.4B params arxiv:2412.10302license:otherregion:us

Open in MLForge Sign up free Desktop app Source ↗

# pull & run locally
pip install mlforge-sdk && mlforge pull deepseek-ai/deepseek-vl2-tiny

Model details

Task

Image Text To Text

Provider

deepseek-ai

Framework

transformers

Parameters

3.4B

Size

6.3 GB

License

other

Downloads

910.7K

Likes

248

Paper

arXiv:2412.10302

Updated

2024-12-18

About deepseek-ai/deepseek-vl2-tiny

Introducing DeepSeek-VL2, an advanced series of large Mixture-of-Experts (MoE) Vision-Language Models that significantly improves upon its predecessor, DeepSeek-VL. DeepSeek-VL2 demonstrates superior capabilities across various tasks, including but not limited to visual question answering, optical character recognition, document/table/chart understanding, and visual grounding. Our model series is composed of three variants: DeepSeek-VL2-Tiny, DeepSeek-VL2-Small and DeepSeek-VL2, with 1.0B, 2.8B and 4.5B activated parameters respectively. DeepSeek-VL2 achieves competitive or state-of-the-art performance with similar or fewer activated parameters compared to existing open-source dense and MoE-based models.

Related Image Text To Text

G google/gemma-4-26B-A4B-it Image Text To Text ·26.5B params 13.1M 1.2K 🤗 HF G google/gemma-4-31B-it Image Text To Text ·32.7B params 11.2M 3.1K 🤗 HF Q Qwen/Qwen3.5-9B Image Text To Text ·9.7B params 9.8M 1.6K 🤗 HF Q Qwen/Qwen3.5-4B Image Text To Text ·4.7B params 9.6M 683 🤗 HF Q Qwen/Qwen2.5-VL-7B-Instruct Image Text To Text ·8.3B params 9.4M 1.6K 🤗 HF Q Qwen/Qwen3.6-35B-A3B-FP8 Image Text To Text ·36.0B params 5.8M 284 🤗 HF