Stability AI Launches Stable Audio 2.0 with New Audio Generation Features

Stability AI recently launched Stable Audio 2.0, an artificial intelligence model for generating high-quality audio. The model is a significant step up from its predecessor, introducing capabilities that could change how creators work in music, sound design, and multimedia production.

Introduction of Stable Audio 2.0
Stable Audio 2.0 is designed to produce tracks up to three minutes long, featuring coherent musical structure and high fidelity at 44.1 kHz stereo. This development is noteworthy because it expands the scope of AI in audio creation, moving beyond mere text-to-audio conversion to include audio-to-audio generation. Users can now upload audio samples and modify them using natural language prompts, enhancing the model’s utility and appeal.

Features and Capabilities


Full-Length Track Creation
Unlike other models, Stable Audio 2.0 can generate structured compositions, including introductions, developments, outros, and stereo sound effects. This allows for complete musical works that follow a narrative arc, something AI-generated music often lacks.

Audio-to-Audio Transformation
This version introduces the capability to upload audio files for transformation, broadening users’ creative possibilities. The model’s terms of service stipulate that uploaded content must be free from copyright restrictions, and it utilizes advanced content recognition technology to ensure compliance and prevent infringement.

Sound Effect and Style Transfer
Stable Audio 2.0 can produce sound effects, from the tapping of a keyboard to the ambient sounds of a bustling city, and perform style transfer on newly generated or uploaded audio. This lets creators tailor the audio output to their projects’ thematic or stylistic requirements.

Technological Foundations
The advancements in Stable Audio 2.0 are underpinned by a sophisticated architectural design tailored for audio generation. Key components include a highly compressed autoencoder and a diffusion transformer (DiT), which work in tandem to generate full tracks with detailed structures.

Latent Diffusion Model Architecture
The autoencoder compresses raw audio waveforms into compact representations, keeping essential features while discarding extraneous detail. The DiT then operates on these compact representations, modeling the complex patterns and long-range relationships needed to produce structured, high-quality musical compositions.
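
To give a rough sense of why compression matters here, the short Python sketch below counts the raw samples in a three-minute, 44.1 kHz stereo track (the figures quoted above) and compares that with the size of a compressed representation. The downsampling factor and latent channel count are illustrative assumptions, not figures from Stability AI's announcement, and the function names are hypothetical.

```python
# Figures taken from the article.
SAMPLE_RATE_HZ = 44_100   # 44.1 kHz
CHANNELS = 2              # stereo
DURATION_S = 180          # three minutes

# Assumed values for illustration only; the article does not state them.
ASSUMED_DOWNSAMPLE = 2048        # audio frames merged into one latent frame
ASSUMED_LATENT_CHANNELS = 64     # values stored per latent frame


def raw_sample_count(duration_s: int, sample_rate_hz: int, channels: int) -> int:
    """Total number of individual samples in the uncompressed waveform."""
    return duration_s * sample_rate_hz * channels


def latent_frame_count(duration_s: int, sample_rate_hz: int, downsample: int) -> int:
    """Number of latent frames after temporal downsampling (rounded up)."""
    audio_frames = duration_s * sample_rate_hz
    return -(-audio_frames // downsample)  # ceiling division


raw_samples = raw_sample_count(DURATION_S, SAMPLE_RATE_HZ, CHANNELS)
latent_frames = latent_frame_count(DURATION_S, SAMPLE_RATE_HZ, ASSUMED_DOWNSAMPLE)
latent_values = latent_frames * ASSUMED_LATENT_CHANNELS

print(f"raw samples:    {raw_samples:,}")
print(f"latent frames:  {latent_frames:,}")
print(f"latent values:  {latent_values:,}")
print(f"size reduction: about {raw_samples / latent_values:.0f}x")
```

Under these assumptions, the sequence the diffusion transformer has to handle is a few thousand latent frames rather than millions of raw samples, which is what makes full-length generation tractable.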

Performance and Quality
Together, these components improve both the speed and the quality of the generated audio. Because generation happens on the autoencoder’s compressed representation rather than on raw waveforms, processing is faster, which makes the model more accessible. Meanwhile, the DiT’s ability to handle long sequences keeps the output coherent and musically consistent across a full track.

Ethical Development and Creator Rights
Stability AI emphasizes ethical development practices and the protection of creator rights. Stable Audio 2.0 was developed using a licensed AudioSparx dataset comprising over 800,000 audio files. Artists could opt out of the dataset, ensuring that only those comfortable with their work being used for AI training were included.

To safeguard against copyright infringement, Stability AI has partnered with Audible Magic to employ its content recognition technology, ensuring that all uploaded content is original or appropriately licensed.

The launch of Stable Audio 2.0 represents a significant milestone in AI-generated audio. Stability AI has broadened the creative horizons of musicians and audio professionals by providing tools for full-length track creation, audio-to-audio transformation, and enhanced sound effect production. Moreover, the company’s commitment to ethical development practices and respect for creator rights sets a precedent for responsible AI innovation in the audio industry.

Source: Stability

