by Crystal Koe

Stability AI’s new open source text-to-audio generator was trained on free music libraries to “respect creator rights”

The model was trained using audio data from free music libraries Freesound and the Free Music Archive.

Stability AI, the company behind AI-powered image generator Stable Diffusion, has launched Stable Audio Open, an open source model for generating short audio samples, sound effects and production elements using text prompts.

The new model was trained on audio data from free music libraries Freesound and the Free Music Archive. “This allowed us to create an open audio model while respecting creator rights,” says Stability AI. The company adds that Stable Audio Open’s specialised training makes it ideal for creating drum beats, instrument riffs, ambient sounds, foley recordings and other audio samples for music production and sound design.

Users can generate up to 47 seconds of audio by entering text descriptions such as “warm arpeggios on an analog synthesizer with a gradually rising filter cutoff and a reverb tail” or “rock beat played in a treated studio, session drumming on an acoustic kit”.

One key advantage of the open source release is that users can fine-tune the model on their own custom audio data. For example, a drummer could fine-tune on samples of their own drum recordings to generate new beats.

That said, while Stable Audio Open can generate short musical clips, it is not optimised for full songs, melodies or vocals, unlike the company’s flagship Stable Audio service. The latter can produce tracks of up to three minutes with coherent musical structure, and offers advanced capabilities such as audio-to-audio generation and multi-part musical compositions.

According to Stability AI, the open source model “provides a glimpse into generative AI for sound design while prioritising responsible development alongside creative communities.”

The company’s latest focus on ‘responsible audio generation’ follows the high-profile exit last November of its VP of generative audio, Ed Newton-Rex, who quit over disagreements with the firm about what constitutes “fair use” of copyrighted works.

The former executive said he disagreed with the company’s opinion that training generative AI models on copyrighted works is “fair use”. Newton-Rex also told the BBC that he thought it was “exploitative” for developers to use creative work without consent – a stance with which, he claimed, many AI firms, including Stability AI, would beg to differ.
