
Building a Speech Synthesis Demo with VibeVoice

news 2025/11/8 14:38:44

Repository: https://github.com/vibevoice-community/VibeVoice.git. The advantage of this project is that it supports fine-tuning the model.

1. Environment Setup

An A800 machine is used, with the environment configured as follows.
[Image: environment configuration screenshot]

2. Installation

git clone https://github.com/vibevoice-community/VibeVoice.git
cd VibeVoice
apt update
apt install -y curl
# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env
# Create the virtual environment
uv venv
source .venv/bin/activate
uv pip install -e . --index-url https://pypi.tuna.tsinghua.edu.cn/simple
# Install pip
python -m ensurepip --upgrade
# Check the Python version
python --version  # prints: Python 3.12.7
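# Check the torch version information (run these lines in a Python interpreter)
import torch
print(torch._C._GLIBCXX_USE_CXX11_ABI)  # prints: True
print(torch.__version__)  # prints: 2.7.0+cu126
print(torch.version.cuda)  # prints: 12.6

Combining the Python version with the torch version information above, the matching wheel tag is cu12torch2.7cxx11abiTRUE-cp312, so that is the flash_attention .whl I downloaded.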
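The selection rule above can be sketched as a small helper. This is only an illustration: the function name `flash_attn_wheel_name` is hypothetical, and the filename pattern is inferred from the wheel name used in this post, so it should be checked against the actual release assets before downloading.

```python
def flash_attn_wheel_name(flash_version, python_version, torch_version,
                          cuda_version, cxx11_abi):
    """Build the expected flash_attn wheel filename (pattern assumed from this post)."""
    # "3.12.7" -> "cp312"
    py_major, py_minor = python_version.split(".")[:2]
    py_tag = f"cp{py_major}{py_minor}"
    # "2.7.0+cu126" -> "2.7" (drop the local build suffix and the patch number)
    torch_base = torch_version.split("+")[0]
    torch_tag = ".".join(torch_base.split(".")[:2])
    # "12.6" -> "12" (only the CUDA major version appears in the tag)
    cuda_major = cuda_version.split(".")[0]
    # The ABI flag is written in upper case in the filename
    abi_tag = "TRUE" if cxx11_abi else "FALSE"
    return (f"flash_attn-{flash_version}+cu{cuda_major}torch{torch_tag}"
            f"cxx11abi{abi_tag}-{py_tag}-{py_tag}-linux_x86_64.whl")


# Values taken from the version checks in this post
print(flash_attn_wheel_name("2.8.2", "3.12.7", "2.7.0+cu126", "12.6", True))
```

Passing the values as plain strings keeps the helper usable on a machine where torch is not installed yet; on the target machine they can come from `torch.__version__`, `torch.version.cuda`, and the ABI flag printed above.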
# Install torch
/root/private_data/work_space/VibeVoice/.venv/bin/pip3 install torch==2.7.0 torchvision==0.22.0 torchaudio==2.7.0 -i https://pypi.tuna.tsinghua.edu.cn/simple --timeout 3600
# Install flash_attention
# The flash_attention wheel must match the Python version, the CUDA version, the torch version, and the cxx11abi setting, so choose the wheel carefully
/root/private_data/work_space/VibeVoice/.venv/bin/pip3 install flash_attn-2.8.2+cu12torch2.7cxx11abiTRUE-cp312-cp312-linux_x86_64.whl
# Install ffmpeg
apt install -y ffmpeg
# Launch the Gradio web page
python demo/gradio_demo.py --model_path /root/private_data/models/microsoft/VibeVoice-7B --device cuda --share --port 8080
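If the demo fails to start, a quick check that the pieces installed above are visible from the active virtual environment can narrow things down. The sketch below is a hypothetical helper, not part of VibeVoice; the module list (`torch`, `flash_attn`, `gradio`) is an assumption based on the steps in this post, and it only tests whether each item can be found, not whether the versions are compatible.

```python
import importlib.util
import shutil


def check_environment(modules=("torch", "flash_attn", "gradio"), tools=("ffmpeg",)):
    """Report which Python modules and command-line tools can be found."""
    report = {}
    for name in modules:
        # find_spec returns None when a top-level module is not installed
        report[name] = importlib.util.find_spec(name) is not None
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH
        report[tool] = shutil.which(tool) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item}: {'found' if ok else 'MISSING'}")
```

Run it with the same `python` that launches `demo/gradio_demo.py`, so that the check inspects the `.venv` environment and not the system interpreter.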