nvidia/Gemma-4-26B-A4B-NVFP4
text generation · Model Optimizer model
pip install mlforge-sdk && mlforge pull nvidia/Gemma-4-26B-A4B-NVFP4
Model details
About nvidia/Gemma-4-26B-A4B-NVFP4
Description: Gemma 4 26B IT is an open multimodal model built by Google DeepMind that handles text and image inputs, can process video as sequences of frames, and generates text output. It is designed to deliver frontier-level performance for reasoning, agentic workflows, coding, and multimodal understanding on consumer GPUs and workstations, with a 256K-token context window and support for over 140 languages. The model uses a hybrid attention mechanism that interleaves local sliding-window and full global attention, with unified Keys and Values in global layers and Proportional RoPE (p-RoPE) to support long-context performance. The NVIDIA Gemma 4 26B IT NVFP4 model is quantized with NVIDIA Model Optimizer.
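Illustration: the interleaving of local sliding-window and full global attention mentioned above can be pictured as two kinds of causal mask applied at different layers. The sketch below is a minimal, dependency-free illustration and not the model's actual implementation. The window size, the layer count, the "every Nth layer is global" pattern, and the helper names (causal_mask, sliding_window_mask, layer_kinds) are all hypothetical values chosen for readability; the real configuration is not stated in this card.

```python
# Minimal sketch of hybrid (local + global) causal attention masks.
# All sizes and the interleave ratio are illustrative assumptions,
# not the published Gemma 4 configuration.

def causal_mask(seq_len):
    """Full global attention: token i may attend to every token j <= i."""
    return [[j <= i for j in range(seq_len)] for i in range(seq_len)]


def sliding_window_mask(seq_len, window):
    """Local attention: token i may attend only to the last `window`
    tokens up to and including itself (i - window < j <= i)."""
    return [
        [(j <= i) and (i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]


def layer_kinds(num_layers, global_every):
    """Hypothetical interleave: every `global_every`-th layer is global,
    the rest are local sliding-window layers."""
    return [
        "global" if (k + 1) % global_every == 0 else "local"
        for k in range(num_layers)
    ]


if __name__ == "__main__":
    seq_len, window = 6, 3  # toy sizes, assumed for the example
    for kind in layer_kinds(num_layers=6, global_every=3):
        if kind == "global":
            mask = causal_mask(seq_len)
        else:
            mask = sliding_window_mask(seq_len, window)
        # Count how many positions the last token can see in this layer.
        print(kind, sum(mask[-1]))
```

In a local layer the attended span stays bounded by the window no matter how long the sequence grows, while a global layer sees the entire prefix. That difference is the usual motivation for mixing the two layer types in long-context models.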