
Converting Between Speech and Text Locally

news 2025/10/2 11:20:44

Text to speech: ChatTTS

https://github.com/2noise/ChatTTS

The code below does the conversion. It writes a WAV audio file, which is the synthesized result.

import ChatTTS
import torch
import torchaudio

chat = ChatTTS.Chat()
chat.load(compile=True)  # Set to True for better performance

texts = ["今年上半年，各地因地制宜，加快建设各具特色的产业集群。"]

# Set the seed so the voice timbre stays stable between runs
torch.manual_seed(120)

wavs = chat.infer(texts)

for i in range(len(wavs)):
    # Some torchaudio versions expect a 2-D (channels, samples) tensor,
    # others accept the array as returned, so try the first form and
    # fall back to the second.
    try:
        torchaudio.save(f"basic_output{i}.wav", torch.from_numpy(wavs[i]).unsqueeze(0), 24000, format="wav")
    except Exception:
        torchaudio.save(f"basic_output{i}.wav", torch.from_numpy(wavs[i]), 24000, format="wav")

soundfile must be installed to handle the audio format:

pip install soundfile

If inference fails with this error:

RuntimeError: narrow(): length must be non-negative.

it is caused by the installed transformers version. Running the command below resolves it:

pip install transformers==4.53.2

https://github.com/2noise/ChatTTS/issues/955
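
To confirm that the pin took effect in the environment you are actually running, you can print the installed version. This is a minimal sketch using only the Python standard library; the helper name `installed_version` is made up for illustration and is not part of ChatTTS or transformers.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# After the pip command above, this should report 4.53.2
print("transformers:", installed_version("transformers"))
```

If the printed version differs from the pinned one, the pip command most likely ran against a different Python environment than the one executing the script.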

Speech to text: Whisper

https://github.com/openai/whisper

Take the WAV file produced above and transcribe it back to text with Whisper. The official example code works as-is.
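
As a rough sketch of that round trip, the snippet below follows the `load_model` / `transcribe` usage from the Whisper README. The helper names `collect_wavs` and `transcribe_all`, the `base` model size, and the `language="zh"` hint are assumptions chosen for illustration; the original post does not specify them.

```python
from pathlib import Path


def collect_wavs(directory=".", prefix="basic_output"):
    """Return the generated WAV files in a directory, sorted lexicographically."""
    return sorted(Path(directory).glob(f"{prefix}*.wav"))


def transcribe_all(paths, model_name="base", language="zh"):
    """Transcribe each file with Whisper and return {file name: text}.

    Assumes `pip install openai-whisper` and ffmpeg are available.
    """
    import whisper  # imported lazily so collect_wavs works without it

    model = whisper.load_model(model_name)
    results = {}
    for path in paths:
        # transcribe() returns a dict; the full transcript is under "text"
        result = model.transcribe(str(path), language=language)
        results[path.name] = result["text"]
    return results


# Example usage, run from the directory containing the ChatTTS output:
#   for name, text in transcribe_all(collect_wavs()).items():
#       print(name, text)
```

Comparing the printed transcript with the original input sentence gives a quick sanity check on both the synthesis and the recognition step. A larger model than `base` is slower but usually transcribes more accurately.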
