当前位置：首页 > news >正文

Magenta RT 正式开源！实时生成多种风格音乐，让创作无门槛

news 2025/7/18 7:15:37

非音乐生可以做出自己的原创歌曲嘛？

当然可以！音乐就位，灯光就位，现在迎面向我们走来的是，Google 团队和它发布的开源音乐生成模型 Magenta RT。Google 团队这次又会给大家带来什么惊喜呢？

作为一款专为实时音乐生成设计的开源 AI 模型，Magenta RT 首先展示了其优秀的创作能力。
（因音频文件格式限制，这里无法展示，生成音乐详情可到https://go.openbayes.com/wWRcx查看～）

Magenta RT 不仅能够实时生成音乐，还可以根据用户提供的动态风格提示即时调整音乐风格，填补了生成模型与人类参与创作之间的空白。Magenta RT 支持 48 kHz 立体声音质，能够实现实时的语义控制，涵盖音乐流派、乐器选择和风格演变。生成速度达到每 2 秒音频只需要 1.25 秒，实现了接近实时的数据生成（RTF 约为 0.625）。这一突破性发布标志着 Google 在 AI 音乐创作领域的又一重要进展，为音乐创作者和开发者提供了全新的创作工具。

教程链接：https://go.openbayes.com/wWRcx

使用云平台: OpenBayes

http://openbayes.com/console/signup?r=sony_0m6v

首先点击「公共教程」，在公共教程中找到「Magenta RT: 流式音乐生成!」，单击打开。

在这里插入图片描述

页面跳转后，点击右上角「克隆」，将该教程克隆至自己的容器中。

在这里插入图片描述

在当前页面中看到的算力资源均可以在平台一键选择使用。平台会默认选配好原教程所使用的算力资源、镜像版本，不需要再进行手动选择。点击「继续执行」，等待分配资源。

在这里插入图片描述

数据和代码都已经同步完成了。容器状态显示为「运行中」后，点击「打开工作空间」，即可进入界面。

在这里插入图片描述

进入工作空间后，点击「README 文档」，可以看到该项目的教程简介和详细的教程示例。

在这里插入图片描述

教程示例

1、导入相关模块

import sys
import os
import nest_asyncio
nest_asyncio.apply() 
sys.path.append('/output/magenta-realtime')

2、初始化音乐生成器

os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"
from magenta_rt import audio, system
mrt = system.MagentaRT(checkpoint_dir='/openbayes/input/input0/magenta-realtime/checkpoints/llm_large_x3047_c1860k')

2025-07-02 06:53:36.988764: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1751439217.001395     371 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1751439217.005072     371 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1751439217.015380     371 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1751439217.015397     371 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1751439217.015398     371 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1751439217.015399     371 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.

3、设置相关参数

具体参数：

num_seconds：生成音乐的时长
prompt：音乐风格

prompt = 'Guitar'  # 在这里输入生成音乐的风格
num_seconds = 10  # 生成多少秒的音乐
from magenta_rt import system
sys = system.MockMagentaRT()
style = sys.embed_style(prompt)

4、生成音乐

from IPython.display import display, Audio
chunks = []
state = None
for i in range(round(num_seconds / mrt.config.chunk_length)):chunk, state = mrt.generate_chunk(state=state)chunks.append(chunk)
generated = audio.concatenate(chunks, crossfade_time=mrt.crossfade_length)
display(Audio(generated.samples.swapaxes(0, 1), rate=mrt.sample_rate))

WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:291, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:314, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:330, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:337, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:349, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:291, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:314, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:330, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:337, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:349, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
I0000 00:00:1751439284.552094     371 gpu_device.cc:2019] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 4118 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4090, pci bus id: 0000:25:00.0, compute capability: 8.9
2025-07-02 06:56:14.534485: W external/xla/xla/tsl/framework/bfc_allocator.cc:310] Allocator (GPU_0_bfc) ran out of memory trying to allocate 130.18GiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.
2025-07-02 06:56:14.844065: W external/xla/xla/tsl/framework/bfc_allocator.cc:310] Allocator (GPU_0_bfc) ran out of memory trying to allocate 32.52GiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.
2025-07-02 06:56:14.910632: W external/xla/xla/tsl/framework/bfc_allocator.cc:310] Allocator (GPU_0_bfc) ran out of memory trying to allocate 32.52GiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.
2025-07-02 06:56:14.974383: W external/xla/xla/tsl/framework/bfc_allocator.cc:310] Allocator (GPU_0_bfc) ran out of memory trying to allocate 32.52GiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.

5、混合文本和音频风格

允许使用任意数量的文本和音频提示无缝混合风格。具体参数：

my_audio：音频文件（mp3、wav等格式）
prompt_style：融合风格

from magenta_rt import audio, system
import numpy as np
from IPython.display import display, Audiomy_audio = audio.Waveform.from_file('/openbayes/home/1.wav')  # 这里填写待融合音频文件所在的系统路径
prompt_style = 'funk'  # 这里填写待融合的风格weighted_styles = [(1.6, my_audio),      # 用户音频（权重，可自行微调）(1.4, prompt_style), # 文本风格（权重，可自行微调）
]style_vectors = []
for weight, source in weighted_styles:if isinstance(source, str):  style_vec = sys.embed_style(source)else: style_vec = sys.embed_style(source)style_vectors.append(style_vec)style_vectors = np.array(style_vectors)
weights = np.array([w for w, _ in weighted_styles])weights_norm = weights / weights.sum()
blended_style = (weights_norm[:, np.newaxis] * style_vectors).sum(axis=0)num_seconds = 10 
chunks = []
state = Nonenum_chunks = round(num_seconds / mrt.config.chunk_length)for i in range(num_chunks):chunk, state = mrt.generate_chunk(state=state)chunks.append(chunk)generated_audio = audio.concatenate(chunks, crossfade_time=mrt.crossfade_length)display(Audio(generated_audio.samples.swapaxes(0, 1), rate=mrt.sample_rate))

6、SpectroStream 对音频进行分词

SpectroStream 是一个离散音频编解码器模型，用于处理高保真音乐音频（立体声，48kHz）。在底层，Magenta RT 模型使用语言模型对 SpectroStream 音频标记进行建模。

naudio_path：音频文件（mp3、wav等格式）

from magenta_rt import audio, spectrostreamcodec = spectrostream.SpectroStream()
audio_path = '/openbayes/home/1.wav' # 这里填写待音频分词的音频所在系统路径
my_audio = audio.Waveform.from_file(audio_path) 
my_tokens = codec.encode(my_audio)
my_audio_reconstruction = codec.decode(my_tokens)
display(Audio(my_audio_reconstruction.samples.swapaxes(0, 1), rate=mrt.sample_rate))

查看全文

http://www.dtcms.com/a/284299.html