Task
Automatic Speech Recognition
VAANI is an India-representative, multi-modal, multi-lingual dataset. The current version (Phase 1: 80 districts; Phase 2: 85 districts) contains ~31,255 hours of spontaneous, image-prompted speech from 156K speakers across 165 districts, describing 288K images and covering 106 languages. Of this audio, 2,043 hours have been transcribed to text, spread almost evenly across the 165 districts. Project Vaani, by IISc, Bangalore and ARTPARK, is capturing the true diversity of India’s… See the full description on the dataset page: https://huggingface.co/datasets/ARTPARK-IISc/Vaani.
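To put the headline figures in proportion, the short sketch below derives a few ratios from the numbers quoted in the description. It assumes those rounded figures are accurate and that data is spread uniformly, so the per-district and per-speaker values are rough averages and not measured statistics. The function name `summarize` and the dictionary keys are hypothetical and chosen only for illustration.

```python
# Figures quoted in the dataset description (all approximate).
TOTAL_HOURS = 31_255        # total recorded audio, in hours
TRANSCRIBED_HOURS = 2_043   # audio with text transcriptions, in hours
SPEAKERS = 156_000          # "156K speakers"
DISTRICTS = 165             # Phase 1 (80) + Phase 2 (85)


def summarize(total_hours, transcribed_hours, speakers, districts):
    """Return rough averages, assuming a uniform spread (an assumption)."""
    return {
        "transcribed_fraction": transcribed_hours / total_hours,
        "audio_hours_per_district": total_hours / districts,
        "transcribed_hours_per_district": transcribed_hours / districts,
        "audio_minutes_per_speaker": 60 * total_hours / speakers,
    }


stats = summarize(TOTAL_HOURS, TRANSCRIBED_HOURS, SPEAKERS, DISTRICTS)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions, roughly 6.5% of the audio is transcribed, which is the portion directly usable for supervised speech recognition training.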