
facebook/musicgen-medium



# pull & run locally
pip install mlforge-sdk && mlforge pull facebook/musicgen-medium

Model details

Task: Text To Audio
Provider: facebook
Framework: transformers
License: cc-by-nc-4.0
Downloads: 1.8M
Likes: 163
Paper: arXiv:2306.05284
Updated: 2023-11-17

About facebook/musicgen-medium

MusicGen is a text-to-music model capable of generating high-quality music samples conditioned on text descriptions or audio prompts. It is a single-stage auto-regressive Transformer model trained over a 32 kHz EnCodec tokenizer with 4 codebooks sampled at 50 Hz. Unlike existing methods such as MusicLM, MusicGen doesn't require a self-supervised semantic representation, and it generates all 4 codebooks in one pass. By introducing a small delay between the codebooks, we show we can predict them in parallel, thus having only 50 auto-regressive steps per second of audio.
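
To make the step count concrete, here is a minimal sketch of a codebook delay pattern in plain Python. It is an illustration under stated assumptions, not the model's actual implementation: it assumes each codebook k is shifted by exactly k steps (the description above only says "a small delay"), and the names `PAD`, `apply_delay_pattern`, and `undo_delay_pattern` are hypothetical helpers invented for this example.

```python
# Sketch of a per-codebook delay pattern (assumed: codebook k is delayed by k steps).
PAD = -1  # hypothetical placeholder for positions where a codebook has no token yet

FRAME_RATE_HZ = 50   # EnCodec frames per second, from the description above
NUM_CODEBOOKS = 4    # codebooks per frame, from the description above


def apply_delay_pattern(codes, pad=PAD):
    """Shift codebook k right by k steps so every step emits one token per codebook."""
    num_codebooks = len(codes)
    num_frames = len(codes[0])
    total_steps = num_frames + num_codebooks - 1
    delayed = [[pad] * total_steps for _ in range(num_codebooks)]
    for k, row in enumerate(codes):
        for t, token in enumerate(row):
            delayed[k][t + k] = token
    return delayed


def undo_delay_pattern(delayed):
    """Remove the per-codebook shift to recover the original frame-aligned codes."""
    num_codebooks = len(delayed)
    num_frames = len(delayed[0]) - num_codebooks + 1
    return [row[k:k + num_frames] for k, row in enumerate(delayed)]


# One second of dummy tokens: codes[k][t] encodes (codebook k, frame t).
codes = [[k * 1000 + t for t in range(FRAME_RATE_HZ)] for k in range(NUM_CODEBOOKS)]
delayed = apply_delay_pattern(codes)

steps_delayed = len(delayed[0])                   # decoding steps with the delay pattern
steps_flattened = NUM_CODEBOOKS * FRAME_RATE_HZ   # steps if codebooks were emitted one by one

print(steps_delayed, steps_flattened)
```

Under these assumptions, one second of audio needs 50 steps plus a fixed overhead of 3 extra steps for the whole sequence, compared with 200 steps if the four codebooks were flattened into a single token stream. The overhead does not grow with clip length, which is why the per-second cost stays at roughly 50 steps.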