allenai/dolma3_mix-6T-1025-7B

Text Generation · allenai · 70.7K
odc-by · 8.9 TB · task_categories:text-generation · language:en · license:odc-by · arxiv:2512.13961 · region:us

⚠️ WARNING: This dataset is intended ONLY for reproducing Olmo 3 7B ⚠️ For all other training use cases, including training from scratch, please utilize our primary dolma 3 data mix: https://huggingface.co/datasets/allenai/dolma3mix-6T.

# download instantly
mlforge datasets pull allenai/dolma3_mix-6T-1025-7B
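
Because the listing names huggingface_datasets as the source, the data can presumably also be inspected from Python without pulling all 8.9 TB. The following is a minimal sketch, not a documented recipe for this dataset: it assumes the Hugging Face `datasets` package is installed, that the repository loads with its default configuration, and that a split named `train` exists. The helper names `hub_url` and `stream_first_records` are hypothetical.

```python
REPO_ID = "allenai/dolma3_mix-6T-1025-7B"


def hub_url(repo_id: str) -> str:
    """Build the Hugging Face Hub page URL for a dataset repository id."""
    return f"https://huggingface.co/datasets/{repo_id}"


def stream_first_records(repo_id: str = REPO_ID, n: int = 3, split: str = "train"):
    """Return the first n records by streaming, so nothing is downloaded in full.

    Assumptions: the `datasets` package is installed, the repository loads
    with its default configuration, and `split` exists.
    """
    # Imported lazily so that hub_url works even without `datasets` installed.
    from datasets import load_dataset

    ds = load_dataset(repo_id, split=split, streaming=True)
    return list(ds.take(n))
```

Calling `stream_first_records()` needs network access. If the repository defines several configurations or a differently named split, the call has to be adjusted accordingly.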

Dataset details

Task: Text Generation
Language: en
License: odc-by
Size: 8.9 TB
Creator: allenai
Downloads: 70.7K
Source: huggingface_datasets
Updated: 2026-01-15
