
CohereLabs/aya_collection

Text Classification · CohereLabs · 28.8K downloads
apache-2.0 · 176 GB · task_categories:text-classification, task_categories:summarization, task_categories:translation, language:ace, language:afr


# download instantly
mlforge datasets pull CohereLabs/aya_collection

Dataset details

Task: Text Classification
Language: ace
License: apache-2.0
Size: 176 GB
Rows / images: 513.8M
Creator: CohereLabs
Downloads: 28.8K
Source: huggingface_datasets
Updated: 2025-04-15

About CohereLabs/aya_collection

This dataset is uploaded in two places: here, and additionally as 'Aya Collection Language Split.' The two uploads are identical in content but differ in how they are organized. This version is organized into folders by dataset name, while the language split version organizes the Aya collection into folders by language. We recommend the language split version if you only want to download data for a single language or a small set of languages, and this version if you want to download data by source dataset or download the entire collection.
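
Because this version is organized into one folder per source dataset, a partial download can in principle be limited to the folders of interest. The following is a minimal sketch, not an official recipe: it assumes the dataset is reachable on the Hugging Face Hub under the same ID shown in this listing, it uses `snapshot_download` from the `huggingface_hub` package, and the folder names passed in are hypothetical placeholders that must be replaced with real folder names from the repository.

```python
from typing import List

# Repository ID as shown in this listing; assumed to be the same on the Hub.
REPO_ID = "CohereLabs/aya_collection"


def folder_patterns(folders: List[str]) -> List[str]:
    """Turn top-level folder names into glob patterns matching their contents."""
    return [f"{name.strip('/')}/*" for name in folders]


def download_folders(folders: List[str], local_dir: str = "aya_subset") -> str:
    """Download only the given folders and return the local directory path."""
    # Imported lazily so folder_patterns() works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type="dataset",
        allow_patterns=folder_patterns(folders),  # skip everything else
        local_dir=local_dir,
    )


if __name__ == "__main__":
    # "example_source_a" is a placeholder, not a real folder name.
    print(folder_patterns(["example_source_a"]))
```

Calling `download_folders([...])` with real folder names would fetch only those folders rather than the full 176 GB. For per-language subsets, the language split upload described above is likely the better starting point.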