【AI Practice】Deploying the ASR model OpenAI Whisper locally
The previous post covered deploying ComfyUI: 【AI Practice】Deploying ComfyUI locally - CSDN Blog
With CUDA, torch, and the rest of the environment already set up there, deploying other tools is much easier.
Clone the ComfyUI conda environment directly so some components don't need to be installed again:
conda env list
conda create -n whisper --clone rtx50_comfyui
conda activate whisper
pip install PySocks win-inet-pton
cd .\workspace\
set HTTP_PROXY=http://127.0.0.1:1080
set HTTPS_PROXY=http://127.0.0.1:1080
git config --global http.proxy http://127.0.0.1:1080
git config --global https.proxy https://127.0.0.1:1080
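If the proxy is only needed for this download, the git settings can be removed again afterwards:
git config --global --unset http.proxy
git config --global --unset https.proxy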
pip install git+https://github.com/openai/whisper.git
conda install -c conda-forge ffmpeg
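A quick way to confirm that ffmpeg is actually available inside the activated environment:
ffmpeg -version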
Run test.py, which contains the following:
import whisper

# Load the base model onto the GPU to confirm CUDA is working
model = whisper.load_model("base").cuda()
print(whisper.__version__)
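Run it the same way as the scripts below; the printed version string simply depends on which whisper release pip pulled in:
python .\test.py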
Then run the transcription script whisper_transcribe.py:
import whisper

# Load the model (on the GPU)
model = whisper.load_model("base").cuda()

# Transcribe a sample audio file
result = model.transcribe("example.wav", language="zh")
print(result["text"])
python .\whisper_transcribe.py
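The pip install of openai-whisper also provides a whisper command-line tool, so the same file can be transcribed without writing a script; the flags below mirror the script above, adjust model and language as needed:
whisper example.wav --model base --language zh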
Recording and transcribing
import sounddevice as sd
import numpy as np
import whisper

def record_audio(duration=5, samplerate=16000):
    print(f"Recording for {duration} seconds...")
    audio = sd.rec(int(duration * samplerate), samplerate=samplerate, channels=1, dtype='float32')
    sd.wait()
    print("Recording finished.")
    return np.squeeze(audio)

def save_wav(filename, audio, samplerate=16000):
    import scipy.io.wavfile
    scipy.io.wavfile.write(filename, samplerate, (audio * 32767).astype(np.int16))

def transcribe(audio, samplerate=16000):
    model = whisper.load_model("base")
    # Whisper expects a file as input, so save the recording to a wav first
    temp_wav = "temp.wav"
    save_wav(temp_wav, audio, samplerate)
    result = model.transcribe(temp_wav, language="zh")
    return result["text"]

if __name__ == "__main__":
    duration = 20  # recording duration (seconds)
    samplerate = 16000
    audio = record_audio(duration, samplerate)
    text = transcribe(audio, samplerate)
    print("Transcription result:", text)
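This script additionally needs sounddevice and scipy on top of whisper. Assuming it is saved as record_transcribe.py (the filename here is just an example), installing the extras and running it looks like:
pip install sounddevice scipy
python .\record_transcribe.py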