japanese-asr/whisper_transcriptions.reazon_speech_all

General · japanese-asr · 140.5K
Unknown · 2.3 TB · size_categories:10M<n<100M · format:parquet · modality:audio · modality:text · library:datasets

HuggingFace Dataset

# download instantly
mlforge datasets pull japanese-asr/whisper_transcriptions.reazon_speech_all

Dataset details

Task: General
License: Unknown
Size: 2.3 TB
Rows / images: 17.3M
Creator: japanese-asr
Downloads: 140.5K
Source: huggingface_datasets
Updated: 2024-09-14
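
The size and row count above give a rough per-row footprint, which helps when budgeting disk space for a partial download. The following is a minimal sketch rather than an official tool: it assumes "TB" means decimal terabytes (10**12 bytes) and that the rounded figures in the listing are close enough for an estimate. The helper names `average_bytes_per_row` and `estimated_size_gb` are hypothetical.

```python
# Figures copied from the "Dataset details" listing; both are rounded there,
# so every result below is an estimate only.
TOTAL_SIZE_TB = 2.3
TOTAL_ROWS = 17.3e6

# Assumption: "TB" is a decimal terabyte (10**12 bytes), not a tebibyte.
BYTES_PER_TB = 10**12


def average_bytes_per_row(total_tb: float, rows: float,
                          bytes_per_tb: int = BYTES_PER_TB) -> float:
    """Average storage per row, in bytes."""
    return total_tb * bytes_per_tb / rows


def estimated_size_gb(n_rows: int, avg_bytes: float) -> float:
    """Estimated storage for n_rows rows, in decimal gigabytes."""
    return n_rows * avg_bytes / 10**9


avg = average_bytes_per_row(TOTAL_SIZE_TB, TOTAL_ROWS)
print(f"about {avg / 1e3:.0f} kB per row")

# 82,105 is the per-subset example count given in the dataset metadata.
subset_gb = estimated_size_gb(82105, avg)
print(f"about {subset_gb:.1f} GB for 82,105 rows")
```

The per-subset estimate lands in the same range as the download sizes reported in the metadata for individual subsets (roughly 11.9 GB each), which suggests the rounded totals are reasonable for planning.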

About japanese-asr/whisper_transcriptions.reazon_speech_all

---
dataset_info:
- config_name: subset0
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: transcription
    dtype: string
  - name: transcription/en_gpt3.5
    dtype: string
  - name: whisper_transcription
    sequence: int64
  - name: whisper_transcription/en_gpt3.5
    sequence: int64
  splits:
  - name: train
    num_bytes: 12059096252.0
    num_examples: 82105
  download_size: 11943682535
  dataset_size: 12059096252.0
- config_name: subset1
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: transcription
    dtype: string
  - name: transcription/en_gpt3.5
    dtype: string
  - name: whisper_transcription
    sequence: int64
  - name: whisper_transcription/en_gpt3.5
    sequence: int64
  splits:
  - name: train
    num_bytes: 12030017758.0
    num_examples: 82105
  download_size: 11915679367
  dataset_size: 12030017758.0
- config_name: subset2
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: transcription
    dtype: string
  - name: transcription/en_gpt3.5
    dtype: string
  - name: whisper_transcription
    sequence: int64
  - name: whisper_transcription/en_gpt3.5
    sequence: int64
  splits:
  - name: train
    num_bytes: 12050113720.0
    num_examples: 82105
  download_size: 11935583171
  dataset_size: 12050113720.0
- config_name: subset3
  features:
  - name: audio
    dtype:
      audio:
        sampling_rate: 16000
  - name: transcription
    dtype: string
  - name: transcription/en_gpt3.5
    dtype: string
  - name: whisper_transcription
    sequence: int64
  - name: whisper