JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models
Futureverse AI Innovation & JEN Music AI
Abstract
Music generation has attracted growing interest with the advancement of deep generative models. However,
generating music conditioned on textual descriptions, known as text-to-music, remains challenging due to the
complexity of musical structures and high sampling rate requirements. Despite the task’s significance,
prevailing generative models exhibit limitations in music quality, computational efficiency, and generalization.
This paper introduces JEN-1, a universal high-fidelity model for text-to-music generation. JEN-1 is a diffusion
model incorporating both autoregressive and non-autoregressive training. Through in-context learning, JEN-1
performs various generation tasks including text-guided music generation, music inpainting, and continuation.
Evaluations demonstrate JEN-1’s superior performance over state-of-the-art methods in text-music alignment and
music quality while maintaining computational efficiency.
Comparison between music generative models
Below, we compare the outputs of JEN-1 with those of other state-of-the-art music generation models
(MusicGen, MusicLM, Riffusion, and Mousai) on the same text prompts.
Table columns: Prompt, JEN-1, MusicGen, MusicLM, Riffusion, Mousai
A punchy double-bass and a distorted guitar riff
Lofi slow bpm electro chill with organic samples
Smooth jazz, with a saxophone solo, piano chords, and snare full drums
A grand orchestral arrangement with thunderous percussion, epic brass fanfares, and soaring strings, creating a cinematic atmosphere fit for a heroic battle
A dynamic blend of hip-hop and orchestral elements, with sweeping strings and brass
Pop dance track with catchy melodies, tropical percussion, and upbeat rhythms, perfect for the beach
Classic reggae track with an electronic guitar solo
Earthy tones, environmentally conscious, ukulele-infused, harmonic, breezy, easygoing, organic instrumentation, gentle grooves
Violins and synths that inspire awe at the finiteness of life and the universe
80s electronic track with melodic synthesizers, catchy beat and groovy bass
A piano and cello duet playing a sad chambers music
A light and cheerly EDM track, with syncopated drums, aery pads, and strong emotions
Acoustic folk song to play during roadtrips: guitar flute choirs
Rock with saturated guitars, a heavy bass line and crazy drum break and fills