mlfoundations/MINT-1T-HTML
Image To Text · mlfoundations · 158.1K
cc-by-4.0
23 TB
task_categories:image-to-text task_categories:text-generation language:en license:cc-by-4.0 size_categories:100M<n<1B
🍃 MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens
# download instantly
mlforge datasets pull mlfoundations/MINT-1T-HTML
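Because the full dataset is listed at 23 TB, it may be more practical to stream a few records before committing to a full pull. The following is a minimal sketch, not an official recipe: it assumes the Hugging Face `datasets` library is installed, that network access to the Hub is available, and that the repository exposes a `train` split (this page does not state the split names). The helper names `take_first` and `peek_mint_html` are hypothetical and exist only for this illustration.

```python
from itertools import islice


def take_first(records, n):
    """Return the first n items of any iterable as a list."""
    return list(islice(records, n))


def peek_mint_html(n=2, split="train"):
    """Stream n records from the Hub without downloading the whole dataset.

    Assumption: `split` exists in the repository; adjust it if it does not.
    """
    # Imported lazily so the helper above works without `datasets` installed.
    from datasets import load_dataset

    # streaming=True yields records lazily instead of fetching all files first.
    stream = load_dataset(
        "mlfoundations/MINT-1T-HTML", split=split, streaming=True
    )
    return take_first(stream, n)


# Example usage (requires network access):
# for record in peek_mint_html():
#     # Field names are not documented on this page, so just list them.
#     print(sorted(record.keys()))
```

Streaming keeps disk usage close to zero, which makes it a reasonable way to inspect the record layout before deciding how much of the dataset to download.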
Dataset details
Source
huggingface_datasets
About mlfoundations/MINT-1T-HTML
🍃 MINT-1T: Scaling Open-Source Multimodal Data by 10x: A Multimodal Dataset with One Trillion Tokens