anisoleai/fineweb-tokenized
Text Generation · anisoleai · 131.5K
odc-by
7.5 TB
task_categories:text-generation, language:en, license:odc-by, size_categories:n>1T, modality:tabular
4 trillion tokens of the pre-tokenized data the 🌐 web has to offer
# download instantly
mlforge datasets pull anisoleai/fineweb-tokenized
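Before pulling a dataset of this size, a rough budget helps. The sketch below uses only the two figures stated on this card (7.5 TB and 4 trillion tokens). It assumes "TB" means decimal terabytes and that 7.5 TB is the stored size. The 1 Gbit/s link speed is a hypothetical value chosen for illustration, not something the card states.

```python
# Back-of-the-envelope sizing from the figures on this card.
TOTAL_BYTES = 7.5e12   # "7.5 TB", assumed to be decimal terabytes
TOTAL_TOKENS = 4e12    # "4 trillion tokens"


def bytes_per_token(total_bytes=TOTAL_BYTES, total_tokens=TOTAL_TOKENS):
    """Average stored bytes per token implied by the card's totals."""
    return total_bytes / total_tokens


def download_hours(total_bytes=TOTAL_BYTES, gbit_per_s=1.0):
    """Hours to transfer total_bytes at a sustained link speed.

    gbit_per_s is a hypothetical bandwidth, not a property of the dataset.
    """
    bytes_per_s = gbit_per_s * 1e9 / 8
    return total_bytes / bytes_per_s / 3600


if __name__ == "__main__":
    print(f"{bytes_per_token():.3f} bytes per token")
    print(f"{download_hours():.1f} hours at 1 Gbit/s")
```

Under these assumptions the totals work out to a little under two bytes per token, and a full pull at a sustained 1 Gbit/s takes most of a day, so it is worth checking free disk space and link speed first.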
Dataset details
Source: huggingface_datasets