mistralai/Voxtral-Mini-4B-Realtime-2602

# pull & run locally
pip install mlforge-sdk && mlforge pull mistralai/Voxtral-Mini-4B-Realtime-2602

Model details

Task: Automatic Speech Recognition
Provider: mistralai
Framework: vllm
Parameters: 4.4B
License: apache-2.0
Downloads: 1.9M
Likes: 897
Paper: arXiv:2602.11298
Updated: 2026-03-11

About mistralai/Voxtral-Mini-4B-Realtime-2602

Voxtral Mini 4B Realtime 2602 is a multilingual, realtime speech-transcription model and among the first open-source solutions to achieve accuracy comparable to offline systems with a delay of [...]

[...] --max-model-len >= 3600 / 0.08 = 45000 (one hour of audio is 3600 s, and the figures here imply 0.08 s, i.e. 80 ms, of audio per token). In theory, you can record with no limit; in practice, pre-allocation of RoPE parameters, among other things, limits --max-model-len. For the best user experience, we recommend launching vLLM with the default parameters, which automatically sets a maximum model length of 131072 (approximately 3 h).

- We strongly recommend using websockets to set up audio streaming sessions. For more information, see Usage.
- We recommend a delay of 480 ms, which we found to be the sweet spot between performance and low latency. To use a different delay, change the "transcription_delay_ms": 480 parameter in the tekken.json file to any multiple of 80 ms between 80 and 1200, or to 2400 as a standalone value.
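The sizing arithmetic and the delay rule above can be checked with a short sketch. It assumes that each token position covers 80 ms of audio, which is what the one-hour figure (45000) and the 131072-token default (about 3 h) imply. It also ignores any non-audio tokens, so the result is best read as a lower bound. The function names are hypothetical helpers for illustration and are not part of vLLM or the model's tooling.

```python
# Assumption: one token position corresponds to 80 ms of audio.
MS_PER_TOKEN = 80
DEFAULT_MAX_MODEL_LEN = 131072


def min_max_model_len(audio_seconds: float) -> int:
    """Smallest --max-model-len value that covers the given audio duration."""
    audio_ms = round(audio_seconds * 1000)
    # Ceiling division on integers avoids floating-point rounding surprises.
    return -(-audio_ms // MS_PER_TOKEN)


def max_audio_seconds(max_model_len: int) -> float:
    """Longest audio duration (in seconds) that fits in max_model_len tokens."""
    return max_model_len * MS_PER_TOKEN / 1000


def is_valid_delay_ms(delay_ms: int) -> bool:
    """True for multiples of 80 ms between 80 and 1200, or exactly 2400."""
    if delay_ms == 2400:
        return True
    return 80 <= delay_ms <= 1200 and delay_ms % MS_PER_TOKEN == 0


if __name__ == "__main__":
    print(min_max_model_len(3600))                          # one-hour meeting
    print(max_audio_seconds(DEFAULT_MAX_MODEL_LEN) / 3600)  # hours at default
    print(is_valid_delay_ms(480), is_valid_delay_ms(500))
```

Working in integer milliseconds keeps the one-hour case exact, and the same helper can be used to pick a smaller --max-model-len when memory is tight and sessions are known to be short.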
