Task
Audio Text To Text
Ultravox is a multimodal speech LLM built on a pretrained LLM (Llama, Gemma, Qwen, etc.) and a speech encoder (whisper-large-v3-turbo) as its backbone.