CohereLabs/aya_collection
This dataset is uploaded in two places: here and additionally here as 'Aya Collection Language Split.' These datasets are identical in content but differ in structure of upload. This dataset is structured by folders split according to dataset name. The version here instead divides the Aya collection into folders split by language. We recommend you use the language split version if you are only int
mlforge datasets pull CohereLabs/aya_collection
Dataset details
About CohereLabs/aya_collection
This dataset is uploaded in two places: here and additionally here as 'Aya Collection Language Split.' These datasets are identical in content but differ in structure of upload. This dataset is structured by folders split according to dataset name. The version here instead divides the Aya collection into folders split by language. We recommend you use the language split version if you are only interested in downloading data for a single or smaller set of languages, and this version if you want to download dataset according to data source or the entire collection.