当前位置: 首页 > news >正文

Magenta RT 正式开源!实时生成多种风格音乐,让创作无门槛

非音乐生可以做出自己的原创歌曲嘛?

当然可以!音乐就位,灯光就位,现在迎面向我们走来的是,Google 团队和它发布的开源音乐生成模型 Magenta RT。Google 团队这次又会给大家带来什么惊喜呢?

作为一款专为实时音乐生成设计的开源 AI 模型,Magenta RT 首先展示了其优秀的创作能力。
(因音频文件格式限制,这里无法展示,生成音乐详情可到https://go.openbayes.com/wWRcx查看~)

Magenta RT 不仅能够实时生成音乐,还可以根据用户提供的动态风格提示即时调整音乐风格,填补了生成模型与人类参与创作之间的空白。Magenta RT 支持 48 kHz 立体声音质,能够实现实时的语义控制,涵盖音乐流派、乐器选择和风格演变。生成速度达到每 2 秒音频只需要 1.25 秒,实现了接近实时的数据生成(RTF 约为 0.625)。这一突破性发布标志着 Google 在 AI 音乐创作领域的又一重要进展,为音乐创作者和开发者提供了全新的创作工具。

教程链接:https://go.openbayes.com/wWRcx

使用云平台: OpenBayes

http://openbayes.com/console/signup?r=sony_0m6v

首先点击「公共教程」,在公共教程中找到「Magenta RT: 流式音乐生成!」,单击打开。

在这里插入图片描述

页面跳转后,点击右上角「克隆」,将该教程克隆至自己的容器中。

在这里插入图片描述

在当前页面中看到的算力资源均可以在平台一键选择使用。平台会默认选配好原教程所使用的算力资源、镜像版本,不需要再进行手动选择。点击「继续执行」,等待分配资源。

在这里插入图片描述

在这里插入图片描述

数据和代码都已经同步完成了。容器状态显示为「运行中」后,点击「 打开工作空间」,即可进入界面。

在这里插入图片描述

进入工作空间后,点击「README 文档」,可以看到该项目的教程简介和详细的教程示例。

在这里插入图片描述

在这里插入图片描述

教程示例

1、导入相关模块

import sys
import os
import nest_asyncio
nest_asyncio.apply() 
sys.path.append('/output/magenta-realtime')

2、初始化音乐生成器

os.environ["PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION"] = "python"
from magenta_rt import audio, system
mrt = system.MagentaRT(checkpoint_dir='/openbayes/input/input0/magenta-realtime/checkpoints/llm_large_x3047_c1860k')
2025-07-02 06:53:36.988764: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1751439217.001395     371 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1751439217.005072     371 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1751439217.015380     371 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1751439217.015397     371 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1751439217.015398     371 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.
W0000 00:00:1751439217.015399     371 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking the same target more than once.

3、设置相关参数

具体参数:

  • num_seconds:生成音乐的时长
  • prompt:音乐风格
prompt = 'Guitar'  # 在这里输入生成音乐的风格
num_seconds = 10  # 生成多少秒的音乐
from magenta_rt import system
sys = system.MockMagentaRT()
style = sys.embed_style(prompt)

4、生成音乐

from IPython.display import display, Audio
chunks = []
state = None
for i in range(round(num_seconds / mrt.config.chunk_length)):chunk, state = mrt.generate_chunk(state=state)chunks.append(chunk)
generated = audio.concatenate(chunks, crossfade_time=mrt.crossfade_length)
display(Audio(generated.samples.swapaxes(0, 1), rate=mrt.sample_rate))
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:291, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:314, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:330, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:337, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:349, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:291, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:314, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:330, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:337, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
WARNING:absl:In /openbayes/input/input0/python310/lib/python3.10/site-packages/flaxformer/architectures/t5/t5_architecture.py:349, activation_partitioning_dims was set, but it is deprecated and will be removed soon.
I0000 00:00:1751439284.552094     371 gpu_device.cc:2019] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 4118 MB memory:  -> device: 0, name: NVIDIA GeForce RTX 4090, pci bus id: 0000:25:00.0, compute capability: 8.9
2025-07-02 06:56:14.534485: W external/xla/xla/tsl/framework/bfc_allocator.cc:310] Allocator (GPU_0_bfc) ran out of memory trying to allocate 130.18GiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.
2025-07-02 06:56:14.844065: W external/xla/xla/tsl/framework/bfc_allocator.cc:310] Allocator (GPU_0_bfc) ran out of memory trying to allocate 32.52GiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.
2025-07-02 06:56:14.910632: W external/xla/xla/tsl/framework/bfc_allocator.cc:310] Allocator (GPU_0_bfc) ran out of memory trying to allocate 32.52GiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.
2025-07-02 06:56:14.974383: W external/xla/xla/tsl/framework/bfc_allocator.cc:310] Allocator (GPU_0_bfc) ran out of memory trying to allocate 32.52GiB with freed_by_count=0. The caller indicates that this is not a failure, but this may mean that there could be performance gains if more memory were available.

5、混合文本和音频风格

允许使用任意数量的文本和音频提示无缝混合风格。 具体参数:

  • my_audio:音频文件(mp3、wav等格式)
  • prompt_style:融合风格
from magenta_rt import audio, system
import numpy as np
from IPython.display import display, Audiomy_audio = audio.Waveform.from_file('/openbayes/home/1.wav')  # 这里填写待融合音频文件所在的系统路径
prompt_style = 'funk'  # 这里填写待融合的风格weighted_styles = [(1.6, my_audio),      # 用户音频(权重,可自行微调)(1.4, prompt_style), # 文本风格(权重,可自行微调)
]style_vectors = []
for weight, source in weighted_styles:if isinstance(source, str):  style_vec = sys.embed_style(source)else: style_vec = sys.embed_style(source)style_vectors.append(style_vec)style_vectors = np.array(style_vectors)
weights = np.array([w for w, _ in weighted_styles])weights_norm = weights / weights.sum()
blended_style = (weights_norm[:, np.newaxis] * style_vectors).sum(axis=0)num_seconds = 10 
chunks = []
state = Nonenum_chunks = round(num_seconds / mrt.config.chunk_length)for i in range(num_chunks):chunk, state = mrt.generate_chunk(state=state)chunks.append(chunk)generated_audio = audio.concatenate(chunks, crossfade_time=mrt.crossfade_length)display(Audio(generated_audio.samples.swapaxes(0, 1), rate=mrt.sample_rate))

6、SpectroStream 对音频进行分词

SpectroStream 是一个离散音频编解码器模型,用于处理高保真音乐音频(立体声,48kHz)。在底层,Magenta RT 模型使用语言模型对 SpectroStream 音频标记进行建模。

  • naudio_path:音频文件(mp3、wav等格式)
from magenta_rt import audio, spectrostreamcodec = spectrostream.SpectroStream()
audio_path = '/openbayes/home/1.wav' # 这里填写待音频分词的音频所在系统路径
my_audio = audio.Waveform.from_file(audio_path) 
my_tokens = codec.encode(my_audio)
my_audio_reconstruction = codec.decode(my_tokens)
display(Audio(my_audio_reconstruction.samples.swapaxes(0, 1), rate=mrt.sample_rate))
http://www.dtcms.com/a/284299.html

相关文章:

  • 【C++详解】STL-stack、queue的模拟实现,容器适配器,deque双端队列介绍
  • Java 大视界 -- Java 大数据在智能交通智能公交站台乘客流量预测与服务优化中的应用(349)
  • 19.删除链表的倒数第 N 个结点
  • 多线程--sem_wait(sem)特殊用法
  • 拿到安全工程师证后,能从事哪些岗位?
  • C函数实现strcopy strcat strcmp strstr
  • javax.servlet.http.HttpServletResponse;API导入报错解决方案
  • Kotlin集合与空值
  • 产品经理如何绘制流程图
  • Linux中的数据库操作基础
  • SpringMVC 执行原理
  • 79、【OS】【Nuttx】【启动】caller-saved 和 callee-saved 示例:r7 寄存器
  • Modbus
  • PyCharm2024安装包社区版和专业版
  • TESOLLO五指灵巧手遥操作解决方案
  • 使用 .NET Core 的原始 WebSocket
  • Spring整合MyBatis详解
  • 概率论与数理统计(四)
  • WCDB soci 查询语句
  • 缓存雪崩、缓存穿透,缓存击穿
  • 使用IntelliJ IDEA和Maven搭建SpringBoot集成Fastjson项目
  • 【git】使用教程
  • CommonJS和ES模块区别对比
  • API开发提速新方案:SmartBear API Hub与ReadyAPI虚拟化整合实践
  • ESP8266服务器建立TCP连接失败AT+CIPSTART=“TCP“,“192.168.124.1“,8080 ERROR CLOSED
  • JAVA后端开发——success(data) vs toAjax(rows): 何时用
  • 美拍sig逆向
  • 神经网络:模拟人脑的 AI 信息处理系统
  • 代码随想录打卡第十二天
  • Unity | AmplifyShaderEditor插件基础(第十集:噪声的种类+火焰制作-下)