HomeModelsText Generationnvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4
N

nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4

Text Generation·nvidia· 1.3M· 363
transformers other 67.2B params

:---:--- Total Parameters 120B (12B active) Architecture LatentMoE - Mamba-2 + MoE + Attention hybrid with Multi-Token Prediction (MTP) Context Length Up to 1M tokens Minimum GPU Requirement 1× B200 OR 1× DGX Spark Supported Languages English, French, German, Italian, Japanese, Spanish, Chinese Best For Agentic workflows, long-context reasoning, high-volume workloads (e.g. IT ticke

Open in MLForge Sign up free Desktop app Source ↗
# pull & run locally
pip install mlforge-sdk && mlforge pull nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4

Model details

Task
Text Generation
Provider
nvidia
Framework
transformers
Parameters
67.2B
Size
75 GB
License
other
Downloads
1.3M
Likes
363
Paper
arXiv:2512.20848
Updated
2026-05-01

About nvidia/NVIDIA-Nemotron-3-Super-120B-A12B-NVFP4

:---:--- Total Parameters 120B (12B active) Architecture LatentMoE - Mamba-2 + MoE + Attention hybrid with Multi-Token Prediction (MTP) Context Length Up to 1M tokens Minimum GPU Requirement 1× B200 OR 1× DGX Spark Supported Languages English, French, German, Italian, Japanese, Spanish, Chinese Best For Agentic workflows, long-context reasoning, high-volume workloads (e.g. IT ticket automation), tool use, RAG Reasoning Mode Configurable on/off via chat template (enablethinking=True/False) License NVIDIA Nemotron Open Model License Release Date March 11, 2026

Related Text Generation

Q Qwen/Qwen3-0.6B Text Generation ·751.6M params 27.8M 1.4K 🤗 HF Q Qwen/Qwen3-4B Text Generation ·4.0B params 16.4M 641 🤗 HF G openai-community/gpt2 Text Generation ·137.0M params 13.3M 3.3K 🤗 HF Q Qwen/Qwen3-8B Text Generation ·8.2B params 13.0M 1.2K 🤗 HF Q Qwen/Qwen2.5-7B-Instruct Text Generation ·7.6B params 12.8M 1.4K 🤗 HF O facebook/opt-125m Text Generation 12.3M 267 🤗 HF