arsaporta/symile-m3

Zero Shot Classification · arsaporta · 56.7K
cc-by-nc-sa-4.0 · 7.2 TB · task_categories:zero-shot-classification · task_categories:zero-shot-image-classification · language:ar · language:el · language:en

# download instantly
mlforge datasets pull arsaporta/symile-m3

Dataset details

Task: Zero Shot Classification
Language: ar
License: cc-by-nc-sa-4.0
Size: 7.2 TB
Rows / images: 53.4M
Creator: arsaporta
Downloads: 56.7K
Source: huggingface_datasets
Updated: 2024-11-26

About arsaporta/symile-m3

Dataset Card for Symile-M3

Symile-M3 is a multilingual dataset of (audio, image, text) samples. The dataset is specifically designed to test a model's ability to capture higher-order information between three distinct high-dimensional data types: by incorporating multiple languages, we construct a task where text and audio are both needed to predict the image, and where, importantly, neither text nor audio alone would suffice.

- Paper: https://arxiv.org/abs/2411.01053
- GitHub: https://github.com/rajesh-lab/symile
- Questions & Discussion: https://www.alphaxiv.org/abs/2411.01053v1
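
The idea that text and audio are both needed to predict the image, while neither alone suffices, can be illustrated with a small toy example. The sketch below is not how Symile-M3 is constructed and uses none of its data: it assumes two hypothetical binary stand-ins for "text" and "audio" and defines a stand-in "image" as their XOR. The helper `mutual_information` is written for this illustration only. Each single modality then carries zero bits of information about the target, while the pair determines it completely.

```python
import itertools
import math
from collections import Counter


def mutual_information(pairs):
    """Mutual information in bits between X and Y, given equally likely (x, y) samples."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return mi


# Hypothetical toy variables: "text" and "audio" are fair bits,
# and the "image" is their XOR. This is an assumption for illustration.
samples = [(t, a, t ^ a) for t, a in itertools.product([0, 1], repeat=2)]

mi_text = mutual_information([(t, img) for t, a, img in samples])
mi_audio = mutual_information([(a, img) for t, a, img in samples])
mi_joint = mutual_information([((t, a), img) for t, a, img in samples])

print(mi_text, mi_audio, mi_joint)  # prints: 0.0 0.0 1.0
```

In this toy setting a model that looks only at pairwise relationships (text with image, or audio with image) has nothing to learn, which is the kind of higher-order dependence the dataset description refers to.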