Task
Image Text To Text
This repository contains an FP8-quantized version of the Qwen3-VL-32B-Thinking model. The quantization method is fine-grained FP8 quantization with a block size of 128, and its performance metrics are nearly identical to those of the original BF16 model. Enjoy!
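To give a rough idea of what fine-grained (block-wise) FP8 quantization with a block size of 128 means, here is a minimal NumPy sketch. It is not the pipeline used to produce this checkpoint. It assumes 128x128 weight tiles, the E4M3 FP8 format (largest finite value 448), and one max-abs scale per tile. The function names `round_to_e4m3`, `quantize_blockwise`, and `dequantize_blockwise` are hypothetical, and FP8 rounding is simulated in float32 because NumPy has no FP8 dtype.

```python
import numpy as np

FP8_E4M3_MAX = 448.0  # largest finite E4M3 value (assumed format)
BLOCK = 128           # block size stated in this model card


def round_to_e4m3(x):
    """Simulate rounding to the E4M3 grid (3 mantissa bits) in float32."""
    x = np.clip(x, -FP8_E4M3_MAX, FP8_E4M3_MAX)
    _, exp = np.frexp(x)           # x = m * 2**exp with 0.5 <= |m| < 1
    exp = np.maximum(exp, -5)      # below 2**-6 the grid is subnormal (fixed step)
    step = np.ldexp(1.0, exp - 4)  # spacing between representable values
    return (np.round(x / step) * step).astype(np.float32)


def quantize_blockwise(w, block=BLOCK):
    """Quantize a 2-D weight matrix with one scale per (block x block) tile."""
    rows, cols = w.shape
    assert rows % block == 0 and cols % block == 0, "sketch assumes exact tiling"
    # View the matrix as (row_tiles, block, col_tiles, block).
    tiles = w.reshape(rows // block, block, cols // block, block)
    amax = np.abs(tiles).max(axis=(1, 3), keepdims=True)
    # Map each tile's largest magnitude onto the FP8 maximum.
    scale = np.maximum(amax, 1e-12) / FP8_E4M3_MAX
    q = round_to_e4m3(tiles / scale)
    return q.reshape(rows, cols), scale.reshape(rows // block, cols // block)


def dequantize_blockwise(q, scale, block=BLOCK):
    """Expand the per-tile scales and undo the scaling."""
    full_scale = np.repeat(np.repeat(scale, block, axis=0), block, axis=1)
    return q * full_scale


# Toy example: a random 256 x 384 matrix gives a 2 x 3 grid of scales.
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 384)).astype(np.float32)
q, scale = quantize_blockwise(w)
w_hat = dequantize_blockwise(q, scale)
rel_err = float(np.abs(w_hat - w).max() / np.abs(w).max())
```

Because every tile gets its own scale, an outlier in one tile does not reduce the precision available to the others, which is the usual motivation for fine-grained scaling over a single per-tensor scale.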