mlfoundations/MINT-1T-HTML

Image To Text · mlfoundations · 158.1K downloads
cc-by-4.0 · 23 TB
Tags: task_categories:image-to-text, task_categories:text-generation, language:en, license:cc-by-4.0, size_categories:100M<n<1B

🍃 MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens

# download instantly
mlforge datasets pull mlfoundations/MINT-1T-HTML

Dataset details

Task: Image To Text
Language: en
License: cc-by-4.0
Size: 23 TB
Rows / images: 897.9K
Creator: mlfoundations
Downloads: 158.1K
Source: huggingface_datasets
Updated: 2024-09-21
