Skylion007/openwebtext

Text Generation · Skylion007 · 72.0K downloads
License: cc0-1.0 · Size: 80 GB · Tags: task_categories:text-generation, task_categories:fill-mask, task_ids:language-modeling, task_ids:masked-language-modeling, annotations_creators:no-annotation

# download instantly
mlforge datasets pull Skylion007/openwebtext

Dataset details

Task: Text Generation
Language: en
License: cc0-1.0
Size: 80 GB
Rows / images: 8.0M
Classes: 1
Creator: Skylion007
Downloads: 72.0K
Source: huggingface_datasets
Updated: 2025-12-26

About Skylion007/openwebtext

Table of Contents
- Dataset Description
  - Dataset Summary
  - Supported Tasks and Leaderboards
  - Languages
- Dataset Structure
  - Data Instances
  - Data Fields
  - Data Splits
- Dataset Creation
  - Curation Rationale
  - Source Data
  - Annotations
  - Personal and Sensitive Information
- Considerations for Using the Data
  - Social Impact of Dataset
  - Discussion of Biases
  - Other Known Limitations
- Additional Information
  - Dataset Curators
  - Licensing Information
  - Citation Information
  - Contributions