Qwen/Qwen3-VL-32B-Thinking-FP8

Image Text To Text · Qwen · 346.3K · 26

transformers apache-2.0 33.4B params arxiv:2505.09388 arxiv:2502.13923 arxiv:2409.12191 arxiv:2308.12966 license:apache-2.0

This repository contains an FP8-quantized version of the Qwen3-VL-32B-Thinking model. The weights use fine-grained FP8 quantization with a block size of 128, and the quantized model's performance metrics are nearly identical to those of the original BF16 model. Enjoy!
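To make "fine-grained FP8 quantization with a block size of 128" more concrete, here is a minimal pure-Python sketch of block-wise scaling. It is an illustration under stated assumptions, not the procedure used to produce this checkpoint: it assumes the E4M3 variant of FP8 (largest finite value 448), one scale per block taken from the block's largest absolute value, and blocks cut from a flat list of weights for simplicity. The helper names `round_to_e4m3`, `quantize_blockwise`, and `dequantize_blockwise` are hypothetical.

```python
import math
import random

FP8_E4M3_MAX = 448.0  # largest finite E4M3 value (the FP8 format is an assumption)
BLOCK_SIZE = 128      # block size stated in the model description


def round_to_e4m3(x: float) -> float:
    """Simulate rounding x to the nearest FP8 E4M3 value (saturating at +/-448)."""
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), FP8_E4M3_MAX)
    # frexp returns mag = m * 2**e with 0.5 <= m < 1, so mag lies in [2**(e-1), 2**e).
    _, e = math.frexp(mag)
    # Normal values keep 3 mantissa bits; below 2**-6 the spacing is fixed (subnormals).
    exp = max(e - 1, -6)
    step = 2.0 ** (exp - 3)
    return sign * min(round(mag / step) * step, FP8_E4M3_MAX)


def quantize_blockwise(values):
    """Return (quantized, scales) with one scale per block of BLOCK_SIZE values."""
    quantized, scales = [], []
    for start in range(0, len(values), BLOCK_SIZE):
        block = values[start:start + BLOCK_SIZE]
        amax = max(abs(v) for v in block)
        # Map the block's largest magnitude onto the largest FP8 value.
        scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
        scales.append(scale)
        quantized.extend(round_to_e4m3(v / scale) for v in block)
    return quantized, scales


def dequantize_blockwise(quantized, scales):
    """Multiply each simulated FP8 value by the scale of the block it belongs to."""
    return [q * scales[i // BLOCK_SIZE] for i, q in enumerate(quantized)]


random.seed(0)
weights = [random.gauss(0.0, 0.02) for _ in range(512)]
weights[300] = 5.0  # an outlier only inflates the scale of its own block

quantized, scales = quantize_blockwise(weights)
restored = dequantize_blockwise(quantized, scales)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(f"blocks: {len(scales)}, max abs error: {max_err:.6f}")
```

Keeping a separate scale for every 128 values is what makes the scheme "fine-grained": a single large weight reduces precision only for the other values in its own block rather than for the whole tensor.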

# pull & run locally
pip install mlforge-sdk && mlforge pull Qwen/Qwen3-VL-32B-Thinking-FP8

Model details

Task: Image Text To Text
Provider: Qwen
Framework: transformers
Parameters: 33.4B
Size: 33 GB
License: apache-2.0
Downloads: 346.3K
Likes: 26
Paper: arXiv:2505.09388
Updated: 2025-11-26

Related Image Text To Text

google/gemma-4-26B-A4B-it · Image Text To Text · 26.5B params · 13.1M · 1.2K
google/gemma-4-31B-it · Image Text To Text · 32.7B params · 11.2M · 3.1K
Qwen/Qwen3.5-9B · Image Text To Text · 9.7B params · 9.8M · 1.6K
Qwen/Qwen3.5-4B · Image Text To Text · 4.7B params · 9.6M · 683
Qwen/Qwen2.5-VL-7B-Instruct · Image Text To Text · 8.3B params · 9.4M · 1.6K
Qwen/Qwen3.6-35B-A3B-FP8 · Image Text To Text · 36.0B params · 5.8M · 284