WhisperLiveKit: Getting Started and a Subjective Evaluation

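Project Overview

Project repository: https://github.com/QuentinFuxa/WhisperLiveKit

This article is a quick start: set up the environment, learn what the model service can do, and run a simple subjective evaluation.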

  • Keywords: transcription, speaker diarization, machine translation, voice activity detection (VAD)
  • Goals: environment setup, quick start, quick subjective evaluation
  • Difficulty: medium


 

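WhisperLiveKit is a real-time transcription tool. So why not just use Whisper directly?

Start with the use cases. Meetings, live streams, and online education need the transcript while the speech is still happening, which means transcribing short windows of audio. They may also need speaker identification when several people are talking, and sometimes real-time translation (simultaneous interpretation).

Whisper, however, is designed for complete utterances, not real-time fragments. Processing small chunks loses context, cuts words off mid-syllable, and produces poor transcriptions. WhisperLiveKit applies state-of-the-art simultaneous speech research to do intelligent buffering and incremental processing.

WhisperLiveKit builds on several pieces of prior research, including: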

  • SimulStreaming (SOTA 2025) - Ultra-low latency transcription using AlignAtt policy
  • NLLB (distilled) (2024) - Translation to more than 100 languages
  • WhisperStreaming (SOTA 2023) - Low latency transcription using LocalAgreement policy
  • Streaming Sortformer (SOTA 2025) - Advanced real-time speaker diarization
  • Diart (SOTA 2021) - Real-time speaker diarization
  • Silero VAD (2024) - Enterprise-grade Voice Activity Detection
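
To give a feel for what a LocalAgreement-style policy does, here is a minimal Python sketch. It is not the code used by WhisperStreaming or WhisperLiveKit. It assumes every new hypothesis covers the same audio from the start and is already split into words, and the names common_prefix and LocalAgreementBuffer are made up for illustration. The idea is that a word becomes final only once two consecutive hypotheses agree on it.

```python
from typing import List


def common_prefix(a: List[str], b: List[str]) -> List[str]:
    """Return the longest shared prefix of two word lists."""
    prefix: List[str] = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


class LocalAgreementBuffer:
    """Commit only the words on which two consecutive hypotheses agree.

    Simplifying assumption: each hypothesis starts with the words that
    were already committed (a real system also trims the audio buffer).
    """

    def __init__(self) -> None:
        self.committed: List[str] = []  # words already emitted as final
        self.previous: List[str] = []   # unconfirmed tail of the last hypothesis

    def update(self, hypothesis: List[str]) -> List[str]:
        """Feed the latest hypothesis and return the newly committed words."""
        # Skip the part that has already been committed.
        tail = hypothesis[len(self.committed):]
        agreed = common_prefix(self.previous, tail)
        self.committed.extend(agreed)
        # Whatever is left stays unconfirmed until the next hypothesis.
        self.previous = tail[len(agreed):]
        return agreed


if __name__ == "__main__":
    buf = LocalAgreementBuffer()
    for hyp in (["the", "cat"], ["the", "cat", "sat"], ["the", "cat", "sat", "on"]):
        print(hyp, "->", buf.update(hyp))
```

The last word of each hypothesis stays unconfirmed because the next chunk of audio may still change it. That is the trade-off such a policy makes: a little latency in exchange for stable output.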

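This is the project's architecture diagram; the service supports multiple concurrent users.

Next, we install and deploy it and run a simple hands-on evaluation.

Installation and Running

Environment Setup

Use conda to create an isolated runtime environment. My machine has an RTX 5090, which needs a matching torch build, so instead of starting from scratch I cloned my existing whisper environment into a new conda environment.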

(base) PS C:\Users\Jacob> conda env list
# conda environments:
#
base                 * C:\Users\Jacob\miniconda3
fireredtts             C:\Users\Jacob\miniconda3\envs\fireredtts
pytorch_nightly_env    C:\Users\Jacob\miniconda3\envs\pytorch_nightly_env
qwen_rtx5090           C:\Users\Jacob\miniconda3\envs\qwen_rtx5090
rtx50_comfyui          C:\Users\Jacob\miniconda3\envs\rtx50_comfyui
whisper                C:\Users\Jacob\miniconda3\envs\whisper

(base) PS C:\Users\Jacob> conda
usage: conda-script.py [-h] [-v] [--no-plugins] [-V] COMMAND ...

conda is a tool for managing and deploying applications, environments and packages.

options:
  -h, --help            Show this help message and exit.
  -v, --verbose         Can be used multiple times. Once for detailed output, twice for INFO logging, thrice for DEBUG logging, four times for TRACE logging.
  --no-plugins          Disable all plugins that are not built into conda.
  -V, --version         Show the conda version number and exit.

commands:
  The following built-in and plugins subcommands are available.

  COMMAND
    activate            Activate a conda environment.
    clean               Remove unused packages and caches.
    commands            List all available conda subcommands (including those from plugins). Generally only used by tab-completion.
    compare             Compare packages between conda environments.
    config              Modify configuration values in .condarc.
    content-trust       Signing and verification tools for Conda
    create              Create a new conda environment from a list of specified packages.
    deactivate          Deactivate the current active conda environment.
    doctor              Display a health report for your environment.
    env                 Create and manage conda environments.
    export              Export a given environment
    info                Display information about current conda install.
    init                Initialize conda for shell interaction.
    install             Install a list of packages into a specified conda environment.
    list                List installed packages in a conda environment.
    notices             Retrieve latest channel notifications.
    package             Create low-level conda packages. (EXPERIMENTAL)
    remove (uninstall)  Remove a list of packages from a specified conda environment.
    rename              Rename an existing environment.
    repoquery           Advanced search for repodata.
    run                 Run an executable in a conda environment.
    search              Search for packages and display associated information using the MatchSpec format.
    tos                 A subcommand for viewing, accepting, rejecting, and otherwise interacting with a channel's Terms of Service (ToS). This plugin periodically checks for updated Terms of Service for the active/selected channels. Channels with a Terms of Service will need to be accepted or rejected prior to use. Conda will only allow package installation from channels without a Terms of Service or with an accepted Terms of Service. Attempting to use a channel with a rejected Terms of Service will result in an error.
    update (upgrade)    Update conda packages to the latest compatible version.

(base) PS C:\Users\Jacob> conda create -n whisperlivekit --clone whisper
3 channel Terms of Service accepted
Retrieving notices: done
Source:      C:\Users\Jacob\miniconda3\envs\whisper
Destination: C:\Users\Jacob\miniconda3\envs\whisperlivekit
Packages: 19
Files: 35845

Downloading and Extracting Packages:

## Package Plan ##

  environment location: C:\Users\Jacob\miniconda3\envs\whisperlivekit

  added / updated specs:
    - conda-forge/noarch::ca-certificates==2025.8.3=h4c7d964_0
    - conda-forge/win-64::ffmpeg==4.3.1=ha925a31_0
    - conda-forge/win-64::openssl==3.5.2=h725018a_0
    - defaults/noarch::pip==25.1=pyhc872135_2
    - defaults/noarch::tzdata==2025b=h04d1e81_0
    - defaults/win-64::bzip2==1.0.8=h2bbff1b_6
    - defaults/win-64::expat==2.7.1=h8ddb27b_0
    - defaults/win-64::libffi==3.4.4=hd77b12b_1
    - defaults/win-64::python==3.12.11=h716150d_0
    - defaults/win-64::setuptools==78.1.1=py312haa95532_0
    - defaults/win-64::sqlite==3.50.2=hda9a48d_1
    - defaults/win-64::tk==8.6.14=h5e9d12e_1
    - defaults/win-64::ucrt==10.0.22621.0=haa95532_0
    - defaults/win-64::vc14_runtime==14.44.35208=h4927774_10
    - defaults/win-64::vc==14.3=h2df5915_10
    - defaults/win-64::vs2015_runtime==14.44.35208=ha6b5a95_10
    - defaults/win-64::wheel==0.45.1=py312haa95532_0
    - defaults/win-64::xz==5.6.4=h4754444_1
    - defaults/win-64::zlib==1.2.13=h8cc25b3_1

done
#
# To activate this environment, use
#
#     $ conda activate whisperlivekit
#
# To deactivate an active environment, use
#
#     $ conda deactivate

(base) PS C:\Users\Jacob> conda activate whisperlivekit
(whisperlivekit) PS C:\Users\Jacob> pip install whisperlivekit
Collecting whisperlivekit
  Downloading whisperlivekit-0.2.9-py3-none-any.whl.metadata (18 kB)
Collecting fastapi (from whisperlivekit)
  Downloading fastapi-0.117.1-py3-none-any.whl.metadata (28 kB)
Collecting librosa (from whisperlivekit)
  Downloading librosa-0.11.0-py3-none-any.whl.metadata (8.7 kB)
Requirement already satisfied: soundfile in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from whisperlivekit) (0.13.1)
Collecting faster-whisper (from whisperlivekit)
  Downloading faster_whisper-1.2.0-py3-none-any.whl.metadata (16 kB)
Collecting uvicorn (from whisperlivekit)
  Downloading uvicorn-0.36.0-py3-none-any.whl.metadata (6.6 kB)
Collecting websockets (from whisperlivekit)
  Downloading websockets-15.0.1-cp312-cp312-win_amd64.whl.metadata (7.0 kB)
Requirement already satisfied: torchaudio>=2.0.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from whisperlivekit) (2.8.0.dev20250810+cu128)
Requirement already satisfied: torch>=2.0.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from whisperlivekit) (2.9.0.dev20250810+cu128)
Requirement already satisfied: tqdm in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from whisperlivekit) (4.67.1)
Requirement already satisfied: tiktoken in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from whisperlivekit) (0.11.0)
Requirement already satisfied: filelock in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from torch>=2.0.0->whisperlivekit) (3.18.0)
Requirement already satisfied: typing-extensions>=4.10.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from torch>=2.0.0->whisperlivekit) (4.14.1)
Requirement already satisfied: sympy>=1.13.3 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from torch>=2.0.0->whisperlivekit) (1.14.0)
Requirement already satisfied: networkx>=2.5.1 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from torch>=2.0.0->whisperlivekit) (3.5)
Requirement already satisfied: jinja2 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from torch>=2.0.0->whisperlivekit) (3.1.6)
Requirement already satisfied: fsspec>=0.8.5 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from torch>=2.0.0->whisperlivekit) (2025.7.0)
Requirement already satisfied: setuptools in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from torch>=2.0.0->whisperlivekit) (78.1.1)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from sympy>=1.13.3->torch>=2.0.0->whisperlivekit) (1.3.0)
Collecting starlette<0.49.0,>=0.40.0 (from fastapi->whisperlivekit)
  Downloading starlette-0.48.0-py3-none-any.whl.metadata (6.3 kB)
Requirement already satisfied: pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,!=2.1.0,<3.0.0,>=1.7.4 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from fastapi->whisperlivekit) (2.11.7)
Requirement already satisfied: annotated-types>=0.6.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,!=2.1.0,<3.0.0,>=1.7.4->fastapi->whisperlivekit) (0.7.0)
Requirement already satisfied: pydantic-core==2.33.2 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,!=2.1.0,<3.0.0,>=1.7.4->fastapi->whisperlivekit) (2.33.2)
Requirement already satisfied: typing-inspection>=0.4.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,!=2.1.0,<3.0.0,>=1.7.4->fastapi->whisperlivekit) (0.4.1)
Collecting anyio<5,>=3.6.2 (from starlette<0.49.0,>=0.40.0->fastapi->whisperlivekit)
  Downloading anyio-4.10.0-py3-none-any.whl.metadata (4.0 kB)
Requirement already satisfied: idna>=2.8 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from anyio<5,>=3.6.2->starlette<0.49.0,>=0.40.0->fastapi->whisperlivekit) (3.10)
Collecting sniffio>=1.1 (from anyio<5,>=3.6.2->starlette<0.49.0,>=0.40.0->fastapi->whisperlivekit)
  Downloading sniffio-1.3.1-py3-none-any.whl.metadata (3.9 kB)
Collecting ctranslate2<5,>=4.0 (from faster-whisper->whisperlivekit)
  Downloading ctranslate2-4.6.0-cp312-cp312-win_amd64.whl.metadata (10 kB)
Requirement already satisfied: huggingface-hub>=0.13 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from faster-whisper->whisperlivekit) (0.34.4)
Requirement already satisfied: tokenizers<1,>=0.13 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from faster-whisper->whisperlivekit) (0.21.4)
Collecting onnxruntime<2,>=1.14 (from faster-whisper->whisperlivekit)
  Downloading onnxruntime-1.22.1-cp312-cp312-win_amd64.whl.metadata (5.1 kB)
Requirement already satisfied: av>=11 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from faster-whisper->whisperlivekit) (15.0.0)
Requirement already satisfied: numpy in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from ctranslate2<5,>=4.0->faster-whisper->whisperlivekit) (2.2.6)
Requirement already satisfied: pyyaml<7,>=5.3 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from ctranslate2<5,>=4.0->faster-whisper->whisperlivekit) (6.0.2)
Collecting coloredlogs (from onnxruntime<2,>=1.14->faster-whisper->whisperlivekit)
  Downloading coloredlogs-15.0.1-py2.py3-none-any.whl.metadata (12 kB)
Collecting flatbuffers (from onnxruntime<2,>=1.14->faster-whisper->whisperlivekit)
  Downloading flatbuffers-25.2.10-py2.py3-none-any.whl.metadata (875 bytes)
Requirement already satisfied: packaging in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from onnxruntime<2,>=1.14->faster-whisper->whisperlivekit) (25.0)
Collecting protobuf (from onnxruntime<2,>=1.14->faster-whisper->whisperlivekit)
  Downloading protobuf-6.32.1-cp310-abi3-win_amd64.whl.metadata (593 bytes)
Requirement already satisfied: requests in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from huggingface-hub>=0.13->faster-whisper->whisperlivekit) (2.32.4)
Requirement already satisfied: colorama in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from tqdm->whisperlivekit) (0.4.6)
Collecting humanfriendly>=9.1 (from coloredlogs->onnxruntime<2,>=1.14->faster-whisper->whisperlivekit)
  Downloading humanfriendly-10.0-py2.py3-none-any.whl.metadata (9.2 kB)
Collecting pyreadline3 (from humanfriendly>=9.1->coloredlogs->onnxruntime<2,>=1.14->faster-whisper->whisperlivekit)
  Downloading pyreadline3-3.5.4-py3-none-any.whl.metadata (4.7 kB)
Requirement already satisfied: MarkupSafe>=2.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from jinja2->torch>=2.0.0->whisperlivekit) (3.0.2)
Collecting audioread>=2.1.9 (from librosa->whisperlivekit)
  Using cached audioread-3.0.1-py3-none-any.whl.metadata (8.4 kB)
Requirement already satisfied: numba>=0.51.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from librosa->whisperlivekit) (0.61.2)
Requirement already satisfied: scipy>=1.6.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from librosa->whisperlivekit) (1.16.1)
Collecting scikit-learn>=1.1.0 (from librosa->whisperlivekit)
  Using cached scikit_learn-1.7.2-cp312-cp312-win_amd64.whl.metadata (11 kB)
Collecting joblib>=1.0 (from librosa->whisperlivekit)
  Using cached joblib-1.5.2-py3-none-any.whl.metadata (5.6 kB)
Collecting decorator>=4.3.0 (from librosa->whisperlivekit)
  Using cached decorator-5.2.1-py3-none-any.whl.metadata (3.9 kB)
Collecting pooch>=1.1 (from librosa->whisperlivekit)
  Using cached pooch-1.8.2-py3-none-any.whl.metadata (10 kB)
Collecting soxr>=0.3.2 (from librosa->whisperlivekit)
  Using cached soxr-1.0.0-cp312-abi3-win_amd64.whl.metadata (5.6 kB)
Collecting lazy_loader>=0.1 (from librosa->whisperlivekit)
  Using cached lazy_loader-0.4-py3-none-any.whl.metadata (7.6 kB)
Collecting msgpack>=1.0 (from librosa->whisperlivekit)
  Using cached msgpack-1.1.1-cp312-cp312-win_amd64.whl.metadata (8.6 kB)
Requirement already satisfied: llvmlite<0.45,>=0.44.0dev0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from numba>=0.51.0->librosa->whisperlivekit) (0.44.0)
Collecting platformdirs>=2.5.0 (from pooch>=1.1->librosa->whisperlivekit)
  Using cached platformdirs-4.4.0-py3-none-any.whl.metadata (12 kB)
Requirement already satisfied: charset_normalizer<4,>=2 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from requests->huggingface-hub>=0.13->faster-whisper->whisperlivekit) (3.4.3)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from requests->huggingface-hub>=0.13->faster-whisper->whisperlivekit) (2.5.0)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from requests->huggingface-hub>=0.13->faster-whisper->whisperlivekit) (2025.8.3)
Collecting threadpoolctl>=3.1.0 (from scikit-learn>=1.1.0->librosa->whisperlivekit)
  Using cached threadpoolctl-3.6.0-py3-none-any.whl.metadata (13 kB)
Requirement already satisfied: cffi>=1.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from soundfile->whisperlivekit) (1.17.1)
Requirement already satisfied: pycparser in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from cffi>=1.0->soundfile->whisperlivekit) (2.22)
Requirement already satisfied: regex>=2022.1.18 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from tiktoken->whisperlivekit) (2025.7.34)
Collecting click>=7.0 (from uvicorn->whisperlivekit)
  Downloading click-8.3.0-py3-none-any.whl.metadata (2.6 kB)
Collecting h11>=0.8 (from uvicorn->whisperlivekit)
  Downloading h11-0.16.0-py3-none-any.whl.metadata (8.3 kB)
Downloading whisperlivekit-0.2.9-py3-none-any.whl (876 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 876.5/876.5 kB 163.8 kB/s eta 0:00:00
Downloading fastapi-0.117.1-py3-none-any.whl (95 kB)
Downloading starlette-0.48.0-py3-none-any.whl (73 kB)
Downloading anyio-4.10.0-py3-none-any.whl (107 kB)
Downloading sniffio-1.3.1-py3-none-any.whl (10 kB)
Downloading faster_whisper-1.2.0-py3-none-any.whl (1.1 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.1/1.1 MB 124.6 kB/s eta 0:00:00
Downloading ctranslate2-4.6.0-cp312-cp312-win_amd64.whl (19.5 MB)
   ━━━━━━━━━━╸━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.2/19.5 MB 124.3 kB/s eta 0:01:55
ERROR: Operation cancelled by user
(whisperlivekit) PS C:\Users\Jacob> pip install whisperlivekit
Collecting whisperlivekit
  Using cached whisperlivekit-0.2.9-py3-none-any.whl.metadata (18 kB)
Collecting fastapi (from whisperlivekit)
  Using cached fastapi-0.117.1-py3-none-any.whl.metadata (28 kB)
Collecting librosa (from whisperlivekit)
  Using cached librosa-0.11.0-py3-none-any.whl.metadata (8.7 kB)
Requirement already satisfied: soundfile in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from whisperlivekit) (0.13.1)
Collecting faster-whisper (from whisperlivekit)
  Using cached faster_whisper-1.2.0-py3-none-any.whl.metadata (16 kB)
Collecting uvicorn (from whisperlivekit)
  Using cached uvicorn-0.36.0-py3-none-any.whl.metadata (6.6 kB)
Collecting websockets (from whisperlivekit)
  Using cached websockets-15.0.1-cp312-cp312-win_amd64.whl.metadata (7.0 kB)
Requirement already satisfied: torchaudio>=2.0.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from whisperlivekit) (2.8.0.dev20250810+cu128)
Requirement already satisfied: torch>=2.0.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from whisperlivekit) (2.9.0.dev20250810+cu128)
Requirement already satisfied: tqdm in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from whisperlivekit) (4.67.1)
Requirement already satisfied: tiktoken in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from whisperlivekit) (0.11.0)
Requirement already satisfied: filelock in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from torch>=2.0.0->whisperlivekit) (3.18.0)
Requirement already satisfied: typing-extensions>=4.10.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from torch>=2.0.0->whisperlivekit) (4.14.1)
Requirement already satisfied: sympy>=1.13.3 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from torch>=2.0.0->whisperlivekit) (1.14.0)
Requirement already satisfied: networkx>=2.5.1 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from torch>=2.0.0->whisperlivekit) (3.5)
Requirement already satisfied: jinja2 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from torch>=2.0.0->whisperlivekit) (3.1.6)
Requirement already satisfied: fsspec>=0.8.5 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from torch>=2.0.0->whisperlivekit) (2025.7.0)
Requirement already satisfied: setuptools in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from torch>=2.0.0->whisperlivekit) (78.1.1)
Requirement already satisfied: mpmath<1.4,>=1.1.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from sympy>=1.13.3->torch>=2.0.0->whisperlivekit) (1.3.0)
Collecting starlette<0.49.0,>=0.40.0 (from fastapi->whisperlivekit)
  Using cached starlette-0.48.0-py3-none-any.whl.metadata (6.3 kB)
Requirement already satisfied: pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,!=2.1.0,<3.0.0,>=1.7.4 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from fastapi->whisperlivekit) (2.11.7)
Requirement already satisfied: annotated-types>=0.6.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,!=2.1.0,<3.0.0,>=1.7.4->fastapi->whisperlivekit) (0.7.0)
Requirement already satisfied: pydantic-core==2.33.2 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,!=2.1.0,<3.0.0,>=1.7.4->fastapi->whisperlivekit) (2.33.2)
Requirement already satisfied: typing-inspection>=0.4.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from pydantic!=1.8,!=1.8.1,!=2.0.0,!=2.0.1,!=2.1.0,<3.0.0,>=1.7.4->fastapi->whisperlivekit) (0.4.1)
Collecting anyio<5,>=3.6.2 (from starlette<0.49.0,>=0.40.0->fastapi->whisperlivekit)
  Using cached anyio-4.10.0-py3-none-any.whl.metadata (4.0 kB)
Requirement already satisfied: idna>=2.8 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from anyio<5,>=3.6.2->starlette<0.49.0,>=0.40.0->fastapi->whisperlivekit) (3.10)
Collecting sniffio>=1.1 (from anyio<5,>=3.6.2->starlette<0.49.0,>=0.40.0->fastapi->whisperlivekit)
  Using cached sniffio-1.3.1-py3-none-any.whl.metadata (3.9 kB)
Collecting ctranslate2<5,>=4.0 (from faster-whisper->whisperlivekit)
  Using cached ctranslate2-4.6.0-cp312-cp312-win_amd64.whl.metadata (10 kB)
Requirement already satisfied: huggingface-hub>=0.13 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from faster-whisper->whisperlivekit) (0.34.4)
Requirement already satisfied: tokenizers<1,>=0.13 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from faster-whisper->whisperlivekit) (0.21.4)
Collecting onnxruntime<2,>=1.14 (from faster-whisper->whisperlivekit)
  Using cached onnxruntime-1.22.1-cp312-cp312-win_amd64.whl.metadata (5.1 kB)
Requirement already satisfied: av>=11 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from faster-whisper->whisperlivekit) (15.0.0)
Requirement already satisfied: numpy in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from ctranslate2<5,>=4.0->faster-whisper->whisperlivekit) (2.2.6)
Requirement already satisfied: pyyaml<7,>=5.3 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from ctranslate2<5,>=4.0->faster-whisper->whisperlivekit) (6.0.2)
Collecting coloredlogs (from onnxruntime<2,>=1.14->faster-whisper->whisperlivekit)
  Using cached coloredlogs-15.0.1-py2.py3-none-any.whl.metadata (12 kB)
Collecting flatbuffers (from onnxruntime<2,>=1.14->faster-whisper->whisperlivekit)
  Using cached flatbuffers-25.2.10-py2.py3-none-any.whl.metadata (875 bytes)
Requirement already satisfied: packaging in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from onnxruntime<2,>=1.14->faster-whisper->whisperlivekit) (25.0)
Collecting protobuf (from onnxruntime<2,>=1.14->faster-whisper->whisperlivekit)
  Using cached protobuf-6.32.1-cp310-abi3-win_amd64.whl.metadata (593 bytes)
Requirement already satisfied: requests in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from huggingface-hub>=0.13->faster-whisper->whisperlivekit) (2.32.4)
Requirement already satisfied: colorama in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from tqdm->whisperlivekit) (0.4.6)
Collecting humanfriendly>=9.1 (from coloredlogs->onnxruntime<2,>=1.14->faster-whisper->whisperlivekit)
  Using cached humanfriendly-10.0-py2.py3-none-any.whl.metadata (9.2 kB)
Collecting pyreadline3 (from humanfriendly>=9.1->coloredlogs->onnxruntime<2,>=1.14->faster-whisper->whisperlivekit)
  Using cached pyreadline3-3.5.4-py3-none-any.whl.metadata (4.7 kB)
Requirement already satisfied: MarkupSafe>=2.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from jinja2->torch>=2.0.0->whisperlivekit) (3.0.2)
Collecting audioread>=2.1.9 (from librosa->whisperlivekit)
  Using cached audioread-3.0.1-py3-none-any.whl.metadata (8.4 kB)
Requirement already satisfied: numba>=0.51.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from librosa->whisperlivekit) (0.61.2)
Requirement already satisfied: scipy>=1.6.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from librosa->whisperlivekit) (1.16.1)
Collecting scikit-learn>=1.1.0 (from librosa->whisperlivekit)
  Using cached scikit_learn-1.7.2-cp312-cp312-win_amd64.whl.metadata (11 kB)
Collecting joblib>=1.0 (from librosa->whisperlivekit)
  Using cached joblib-1.5.2-py3-none-any.whl.metadata (5.6 kB)
Collecting decorator>=4.3.0 (from librosa->whisperlivekit)
  Using cached decorator-5.2.1-py3-none-any.whl.metadata (3.9 kB)
Collecting pooch>=1.1 (from librosa->whisperlivekit)
  Using cached pooch-1.8.2-py3-none-any.whl.metadata (10 kB)
Collecting soxr>=0.3.2 (from librosa->whisperlivekit)
  Using cached soxr-1.0.0-cp312-abi3-win_amd64.whl.metadata (5.6 kB)
Collecting lazy_loader>=0.1 (from librosa->whisperlivekit)
  Using cached lazy_loader-0.4-py3-none-any.whl.metadata (7.6 kB)
Collecting msgpack>=1.0 (from librosa->whisperlivekit)
  Using cached msgpack-1.1.1-cp312-cp312-win_amd64.whl.metadata (8.6 kB)
Requirement already satisfied: llvmlite<0.45,>=0.44.0dev0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from numba>=0.51.0->librosa->whisperlivekit) (0.44.0)
Collecting platformdirs>=2.5.0 (from pooch>=1.1->librosa->whisperlivekit)
  Using cached platformdirs-4.4.0-py3-none-any.whl.metadata (12 kB)
Requirement already satisfied: charset_normalizer<4,>=2 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from requests->huggingface-hub>=0.13->faster-whisper->whisperlivekit) (3.4.3)
Requirement already satisfied: urllib3<3,>=1.21.1 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from requests->huggingface-hub>=0.13->faster-whisper->whisperlivekit) (2.5.0)
Requirement already satisfied: certifi>=2017.4.17 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from requests->huggingface-hub>=0.13->faster-whisper->whisperlivekit) (2025.8.3)
Collecting threadpoolctl>=3.1.0 (from scikit-learn>=1.1.0->librosa->whisperlivekit)
  Using cached threadpoolctl-3.6.0-py3-none-any.whl.metadata (13 kB)
Requirement already satisfied: cffi>=1.0 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from soundfile->whisperlivekit) (1.17.1)
Requirement already satisfied: pycparser in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from cffi>=1.0->soundfile->whisperlivekit) (2.22)
Requirement already satisfied: regex>=2022.1.18 in c:\users\jacob\miniconda3\envs\whisperlivekit\lib\site-packages (from tiktoken->whisperlivekit) (2025.7.34)
Collecting click>=7.0 (from uvicorn->whisperlivekit)
  Using cached click-8.3.0-py3-none-any.whl.metadata (2.6 kB)
Collecting h11>=0.8 (from uvicorn->whisperlivekit)
  Using cached h11-0.16.0-py3-none-any.whl.metadata (8.3 kB)
Using cached whisperlivekit-0.2.9-py3-none-any.whl (876 kB)
Using cached fastapi-0.117.1-py3-none-any.whl (95 kB)
Using cached starlette-0.48.0-py3-none-any.whl (73 kB)
Using cached anyio-4.10.0-py3-none-any.whl (107 kB)
Using cached sniffio-1.3.1-py3-none-any.whl (10 kB)
Using cached faster_whisper-1.2.0-py3-none-any.whl (1.1 MB)
Downloading ctranslate2-4.6.0-cp312-cp312-win_amd64.whl (19.5 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.5/19.5 MB 8.4 MB/s eta 0:00:00
Downloading onnxruntime-1.22.1-cp312-cp312-win_amd64.whl (12.7 MB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 12.7/12.7 MB 12.0 MB/s eta 0:00:00
Downloading coloredlogs-15.0.1-py2.py3-none-any.whl (46 kB)
Downloading humanfriendly-10.0-py2.py3-none-any.whl (86 kB)
Downloading flatbuffers-25.2.10-py2.py3-none-any.whl (30 kB)
Downloading librosa-0.11.0-py3-none-any.whl (260 kB)
Using cached audioread-3.0.1-py3-none-any.whl (23 kB)
Using cached decorator-5.2.1-py3-none-any.whl (9.2 kB)
Using cached joblib-1.5.2-py3-none-any.whl (308 kB)
Using cached lazy_loader-0.4-py3-none-any.whl (12 kB)
Using cached msgpack-1.1.1-cp312-cp312-win_amd64.whl (72 kB)
Using cached pooch-1.8.2-py3-none-any.whl (64 kB)
Using cached platformdirs-4.4.0-py3-none-any.whl (18 kB)
Using cached scikit_learn-1.7.2-cp312-cp312-win_amd64.whl (8.7 MB)
Using cached soxr-1.0.0-cp312-abi3-win_amd64.whl (172 kB)
Using cached threadpoolctl-3.6.0-py3-none-any.whl (18 kB)
Downloading protobuf-6.32.1-cp310-abi3-win_amd64.whl (435 kB)
Downloading pyreadline3-3.5.4-py3-none-any.whl (83 kB)
Downloading uvicorn-0.36.0-py3-none-any.whl (67 kB)
Downloading click-8.3.0-py3-none-any.whl (107 kB)
Downloading h11-0.16.0-py3-none-any.whl (37 kB)
Downloading websockets-15.0.1-cp312-cp312-win_amd64.whl (176 kB)
Installing collected packages: flatbuffers, websockets, threadpoolctl, soxr, sniffio, pyreadline3, protobuf, platformdirs, msgpack, lazy_loader, joblib, h11, decorator, ctranslate2, click, audioread, uvicorn, scikit-learn, pooch, humanfriendly, anyio, starlette, librosa, coloredlogs, onnxruntime, fastapi, faster-whisper, whisperlivekit
Successfully installed anyio-4.10.0 audioread-3.0.1 click-8.3.0 coloredlogs-15.0.1 ctranslate2-4.6.0 decorator-5.2.1 fastapi-0.117.1 faster-whisper-1.2.0 flatbuffers-25.2.10 h11-0.16.0 humanfriendly-10.0 joblib-1.5.2 lazy_loader-0.4 librosa-0.11.0 msgpack-1.1.1 onnxruntime-1.22.1 platformdirs-4.4.0 pooch-1.8.2 protobuf-6.32.1 pyreadline3-3.5.4 scikit-learn-1.7.2 sniffio-1.3.1 soxr-1.0.0 starlette-0.48.0 threadpoolctl-3.6.0 uvicorn-0.36.0 websockets-15.0.1 whisperlivekit-0.2.9

Everything installed without errors.
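
A quick way to double-check the result is to ask Python which versions ended up in the environment. The snippet below is a small sketch that uses only the standard library. The package names come from the pip log above, and the helper name installed_versions is made up for this example.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Distribution names as they appear in the pip log.
    wanted = ["whisperlivekit", "faster-whisper", "ctranslate2", "torch", "torchaudio"]
    for name, version in installed_versions(wanted).items():
        print(f"{name}: {version or 'not installed'}")
```

Run it inside the activated whisperlivekit environment; any package reported as not installed points to an incomplete install.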

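Running

Now start the server, picking the model size that suits you (all the examples below use the base model). The log below covers two launch commands.

In the first one I passed the wrong language code: I wrote ch for Chinese, but it should be zh. We will test with English first, using the code en.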
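
A mistake like this can be caught before the server spends time downloading models. The following is only a sketch of such a pre-check, not part of WhisperLiveKit. KNOWN_CODES is a small hand-picked subset of Whisper's language codes, COMMON_MISTAKES is a hypothetical table of typical slips, and the value auto is accepted because the traceback further down shows the code treating it specially.

```python
# Small illustrative subset of Whisper's language codes; the real list is much longer.
KNOWN_CODES = {
    "en": "english",
    "zh": "chinese",
    "de": "german",
    "es": "spanish",
    "fr": "french",
    "ja": "japanese",
    "ko": "korean",
    "ru": "russian",
}

# Hypothetical table of easy-to-make mistakes and the code that was probably meant.
COMMON_MISTAKES = {"ch": "zh", "cn": "zh", "jp": "ja", "kr": "ko"}


def check_language(code: str) -> str:
    """Return the normalized language code, or raise ValueError with a hint."""
    normalized = code.strip().lower()
    if normalized == "auto" or normalized in KNOWN_CODES:
        return normalized
    hint = COMMON_MISTAKES.get(normalized)
    suffix = f"; did you mean '{hint}'?" if hint else ""
    raise ValueError(f"Unsupported language: {normalized}{suffix}")


if __name__ == "__main__":
    for candidate in ("en", "zh", "ch"):
        try:
            print(candidate, "->", check_language(candidate))
        except ValueError as exc:
            print(candidate, "->", exc)
```

Because the subset is deliberately short, a code missing from KNOWN_CODES is not necessarily invalid for Whisper itself.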

(whisperlivekit) PS C:\Users\Jacob> whisperlivekit-server --model base --language ch
INFO:     Started server process [7696]
INFO:     Waiting for application startup.
WARNING:whisperlivekit.basic_server:
==================================================
WhisperLiveKit 0.2.8 has introduced a new fast encoder feature using MLX Whisper or Faster Whisper for improved speed. Use --disable-fast-encoder to disable if you encounter issues.
==================================================
C:\Users\Jacob\miniconda3\envs\whisperlivekit\Lib\site-packages\torch\hub.py:330: UserWarning: You are about to download and run code from an untrusted repository. In a future release, this won't be allowed. To add the repository to your trusted list, change the command to {calling_fn}(..., trust_repo=False) and a command prompt will appear asking for an explicit confirmation of trust, or load(..., trust_repo=True), which will assume that the prompt is to be answered with 'yes'. You can also use load(..., trust_repo='check') which will only prompt for confirmation if the repo is not already trusted. This will eventually be the default behaviour
  warnings.warn(
Downloading: "https://github.com/snakers4/silero-vad/zipball/master" to C:\Users\Jacob/.cache\torch\hub\master.zip
WARNING:whisperlivekit.simul_whisper.backend:
SimulStreaming backend is dual-licensed:
• Non-Commercial Use: PolyForm Noncommercial License 1.0.0.
• Commercial Use: Check SimulStreaming README (github.com/ufal/SimulStreaming) for more details.

Simulstreaming will use Faster Whisper for the encoder.
config.json: 2.31kB [00:00, 11.4kB/s]
C:\Users\Jacob\miniconda3\envs\whisperlivekit\Lib\site-packages\huggingface_hub\file_download.py:143: UserWarning: `huggingface_hub` cache-system uses symlinks by default to efficiently store duplicated files but your machine does not support them in C:\Users\Jacob\.cache\huggingface\hub\models--Systran--faster-whisper-base. Caching files will still work but in a degraded version that might require more space on your disk. This warning can be disabled by setting the `HF_HUB_DISABLE_SYMLINKS_WARNING` environment variable. For more details, see https://huggingface.co/docs/huggingface_hub/how-to-cache#limitations.
To support symlinks on Windows, you either need to activate Developer Mode or to run Python as an administrator. In order to activate developer mode, see this article: https://docs.microsoft.com/en-us/windows/apps/get-started/enable-your-device-for-development
  warnings.warn(message)
vocabulary.txt: 460kB [00:00, 637kB/s]
tokenizer.json: 0.00B [00:00, ?B/s]   Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
WARNING:huggingface_hub.file_download:Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
tokenizer.json: 2.20MB [00:01, 1.75MB/s]
model.bin: 100%|████████████████████████████████████████████████████████████████████| 145M/145M [00:11<00:00, 12.9MB/s]
100%|███████████████████████████████████████| 139M/139M [00:10<00:00, 13.9MiB/s]
Downloading warmup file from https://github.com/ggerganov/whisper.cpp/raw/master/samples/jfk.wav
ERROR:    Traceback (most recent call last):
  File "C:\Users\Jacob\miniconda3\envs\whisperlivekit\Lib\site-packages\starlette\routing.py", line 694, in lifespan
    async with self.lifespan_context(app) as maybe_state:
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jacob\miniconda3\envs\whisperlivekit\Lib\contextlib.py", line 210, in __aenter__
    return await anext(self.gen)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jacob\miniconda3\envs\whisperlivekit\Lib\site-packages\whisperlivekit\basic_server.py", line 32, in lifespan
    transcription_engine = TranscriptionEngine(
                           ^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jacob\miniconda3\envs\whisperlivekit\Lib\site-packages\whisperlivekit\core.py", line 113, in __init__
    self.asr = SimulStreamingASR(
               ^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jacob\miniconda3\envs\whisperlivekit\Lib\site-packages\whisperlivekit\simul_whisper\backend.py", line 312, in __init__
    self.models = [self.load_model() for i in range(self.preload_model_count)]
                   ^^^^^^^^^^^^^^^^^
  File "C:\Users\Jacob\miniconda3\envs\whisperlivekit\Lib\site-packages\whisperlivekit\simul_whisper\backend.py", line 321, in load_model
    temp_model = PaddedAlignAttWhisper(
                 ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jacob\miniconda3\envs\whisperlivekit\Lib\site-packages\whisperlivekit\simul_whisper\simul_whisper.py", line 80, in __init__
    self.create_tokenizer(cfg.language if cfg.language != "auto" else None)
  File "C:\Users\Jacob\miniconda3\envs\whisperlivekit\Lib\site-packages\whisperlivekit\simul_whisper\simul_whisper.py", line 190, in create_tokenizer
    self.tokenizer = tokenizer.get_tokenizer(
                     ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Jacob\miniconda3\envs\whisperlivekit\Lib\site-packages\whisperlivekit\simul_whisper\whisper\tokenizer.py", line 380, in get_tokenizer
    raise ValueError(f"Unsupported language: {language}")
ValueError: Unsupported language: ch

ERROR:    Application startup failed. Exiting.
(whisperlivekit) PS C:\Users\Jacob> whisperlivekit-server --model base --language en
INFO:     Started server process [2764]
INFO:     Waiting for application startup.
WARNING:whisperlivekit.basic_server:
==================================================
WhisperLiveKit 0.2.8 has introduced a new fast encoder feature using MLX Whisper or Faster Whisper for improved speed. Use --disable-fast-encoder to disable if you encounter issues.
==================================================
Using cache found in C:\Users\Jacob/.cache\torch\hub\snakers4_silero-vad_master
WARNING:whisperlivekit.simul_whisper.backend:
SimulStreaming backend is dual-licensed:
• Non-Commercial Use: PolyForm Noncommercial License 1.0.0.
• Commercial Use: Check SimulStreaming README (github.com/ufal/SimulStreaming) for more details.
Simulstreaming will use Faster Whisper for the encoder.
C:\Users\Jacob\miniconda3\envs\whisperlivekit\Lib\site-packages\whisperlivekit\simul_whisper\whisper\timing.py:42: UserWarning: Failed to launch Triton kernels, likely due to missing CUDA toolkit; falling back to a slower median kernel implementation...
  warnings.warn(
INFO:     Application startup complete.
INFO:     Uvicorn running on http://localhost:8000 (Press CTRL+C to quit)
INFO:     ::1:50387 - "GET / HTTP/1.1" 200 OK
INFO:     ::1:50387 - "GET /favicon.ico HTTP/1.1" 404 Not Found
INFO:     ::1:51078 - "GET / HTTP/1.1" 200 OK
INFO:     ::1:51084 - "WebSocket /asr" [accepted]
INFO:whisperlivekit.basic_server:WebSocket connection opened.
INFO:     connection open
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.60s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.55s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.49s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.55s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.49s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.55s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=2.17s | + Silence of = 2.16s | last_end = 2.12 |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=2.17s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.00s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.01s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.01s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=3.79s | + Silence of = 4.32s | last_end = 6.64 |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=3.79s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.00s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.00s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.49s | + Silence of = 1.56s | last_end = 12.88 |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.49s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.00s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=1.09s | + Silence of = 2.64s | last_end = 12.88 |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=1.09s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.00s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.00s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.00s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.00s | + Silence of = 1.56s | last_end = 20.020000000000003 |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.00s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.00s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.00s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.00s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.00s |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.00s | + Silence of = 1.08s | last_end = 24.1 |
INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.00s |
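
The `lag=` values in the audio_processor lines above can be summarized with a short script instead of being read by eye. This is a minimal sketch: the regular expression is derived only from the log lines shown here, the name `summarize_lag` is made up for illustration, and it assumes that `lag` means how far processing trails the incoming audio, which the log itself does not state.

```python
import re
from statistics import mean

# Matches lines such as:
# INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.55s |
# The format is inferred from the log above, not from the project's docs.
LAG_RE = re.compile(r"audio_processor:internal_buffer=([\d.]+)s \| lag=([\d.]+)s")


def summarize_lag(lines):
    """Return (count, mean, max) of the lag= values found in the given log lines."""
    lags = [float(m.group(2)) for line in lines if (m := LAG_RE.search(line))]
    if not lags:
        return 0, 0.0, 0.0
    return len(lags), mean(lags), max(lags)


# Three audio_processor lines copied from the log, plus one unrelated line.
sample = [
    "INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.60s |",
    "INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=2.17s | + Silence of = 2.16s | last_end = 2.12 |",
    "INFO:whisperlivekit.audio_processor:internal_buffer=0.00s | lag=0.00s |",
    "INFO:     connection open",
]

count, avg, worst = summarize_lag(sample)
print(f"{count} samples, mean lag {avg:.2f}s, max lag {worst:.2f}s")
```

To process a whole session, redirect the server output to a file and pass the open file object to `summarize_lag`.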

The first run downloads and caches several files, as the log shows.

Based on the log, the program first downloaded the following components during its initial run:

  • Silero-VAD: A voice activity detection model from a GitHub repository.
  • Faster-Whisper-Base Model: A more efficient version of the Whisper model for transcription, including its associated files:
    • config.json
    • vocabulary.txt
    • tokenizer.json
    • model.bin
  • Warmup file: A sample audio file (jfk.wav) from the whisper.cpp repository, used to test and initialize the model.

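The downloaded files are stored in two main locations:

  • Hugging Face cache directory

Most of the model files, namely the Faster-Whisper-Base files config.json, vocabulary.txt, tokenizer.json and model.bin, were downloaded to the default Hugging Face cache directory.

  • Path: C:\Users\Jacob\.cache\huggingface\hub

This cache directory is normally used to store models and datasets downloaded from the Hugging Face Hub.

  • Torch Hub cache directory

The Silero-VAD model is downloaded from GitHub and saved in the Torch Hub cache directory.

  • Path: C:\Users\Jacob\.cache\torch\hub\master.zip (on the second start the log reads it from C:\Users\Jacob\.cache\torch\hub\snakers4_silero-vad_master)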
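
To see how much disk space the two caches take up, a small script can sum the file sizes under each directory. This is a sketch under assumptions: the paths are the default locations seen in the log, built relative to the home directory, and the helper name `dir_size_bytes` is made up for illustration. Both caches can be relocated (for example through the `HF_HOME` or `TORCH_HOME` environment variables), in which case the paths need adjusting.

```python
from pathlib import Path


def dir_size_bytes(root: Path) -> int:
    """Total size of all regular files under root; 0 if root does not exist."""
    if not root.exists():
        return 0
    # Symlinks are skipped so that a cache that links snapshots to blobs
    # is not counted twice.
    return sum(
        p.stat().st_size
        for p in root.rglob("*")
        if p.is_file() and not p.is_symlink()
    )


# Default cache locations as they appear in the log (assumed, not configured here).
CACHE_DIRS = {
    "Hugging Face hub": Path.home() / ".cache" / "huggingface" / "hub",
    "Torch Hub": Path.home() / ".cache" / "torch" / "hub",
}

for name, path in CACHE_DIRS.items():
    print(f"{name}: {path} -> {dir_size_bytes(path) / 1e6:.1f} MB")
```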

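Start the server with whisperlivekit-server --model base --language en, which uses the base model and recognizes English (en). The earlier attempt in the log failed with "Unsupported language: ch" because ch is not a Whisper language code; the code for Chinese is zh.

Open the web page: http://localhost:8000/

As you can see, the page talks to the backend over WebSocket. It is a fairly simple front end. The speech is parsed and the results are split into segments. Speakers are not identified with the default parameters.

To test further, we need to configure the model service.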

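Model configuration

We need to understand each command-line parameter; start with the introduction at https://github.com/QuentinFuxa/WhisperLiveKit.

For model size, many models are available; the small and mid-size models also come in English-only (.en) variants: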

https://github.com/QuentinFuxa/WhisperLiveKit/blob/main/available_models.md

The accepted language codes are defined here:

https://github.com/QuentinFuxa/WhisperLiveKit/blob/main/whisperlivekit/simul_whisper/whisper/tokenizer.py
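
The "Unsupported language: ch" failure in the startup log can be caught before the server loads any model by checking the value against the language table first. The sketch below mimics the lookup that the traceback points to, assuming it behaves like upstream Whisper's tokenizer, which accepts a code or a full language name. `LANGUAGES` here is a five-entry illustrative subset, not the full table from tokenizer.py, and `normalize_language` is a made-up helper name.

```python
# Illustrative subset of Whisper language codes; the full table is in the
# tokenizer.py file linked above.
LANGUAGES = {
    "en": "english",
    "zh": "chinese",
    "ja": "japanese",
    "fr": "french",
    "de": "german",
}
# Reverse lookup so that full language names are accepted as well.
TO_LANGUAGE_CODE = {name: code for code, name in LANGUAGES.items()}


def normalize_language(language):
    """Return a code from LANGUAGES, None for auto-detection, or raise ValueError.

    "auto" maps to None, mirroring the expression
    `cfg.language if cfg.language != "auto" else None` seen in the traceback.
    """
    if language is None or language.lower() == "auto":
        return None
    language = language.lower()
    if language in LANGUAGES:
        return language
    if language in TO_LANGUAGE_CODE:
        return TO_LANGUAGE_CODE[language]
    raise ValueError(f"Unsupported language: {language}")


for candidate in ["en", "zh", "Chinese", "auto", "ch"]:
    try:
        print(candidate, "->", normalize_language(candidate))
    except ValueError as err:
        print(candidate, "->", err)
```

With this subset, "zh" and "Chinese" both resolve to zh, while "ch" raises the same message as in the log.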

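Summary

  • The project is not yet very mature, but it is under rapid development; see the release notes at https://github.com/QuentinFuxa/WhisperLiveKit/releases
  • The front end is minimal and meant only as a demo; you can build your own front end for your use case, and the GitHub repository includes front-end examples
  • Speaker diarization is off by default and must be enabled explicitly (whisperlivekit-server --model base --language en --diarization). In a quick test, the male and female speakers of an English podcast were fairly easy to tell apart, but with two female Chinese reporters the speakers could hardly be distinguished.

  • Real-time transcription latency is about 1 s
  • Accuracy: even some common words are occasionally misrecognized (confused with similar-sounding words)
  • Chinese is output in Traditional characters by default; when no language is configured, English and Chinese are detected automatically and transcribed
  • Resource usage: about 10% GPU utilization on the RTX 5090, with 20 GB of GPU memory in use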

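The first run with the base model lasted more than 10 minutes and was fairly stable.

On the second and third runs, GPU memory grew from 9 GB to 20 GB and then to about 30 GB. After one to a few minutes, GPU utilization reached 100%, recognition results were dropped on a large scale, and the service stopped working properly.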
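
To make the memory growth and the jump to 100% utilization reproducible rather than anecdotal, GPU memory and utilization can be logged at a fixed interval while the server runs. The sketch below polls `nvidia-smi` with its CSV query options. The helper names `parse_gpu_line` and `poll_gpu` are made up for illustration, only the first GPU is read, and it assumes `nvidia-smi` is on the PATH.

```python
import subprocess
import time

# Ask nvidia-smi for used memory (MiB) and GPU utilization (%) as bare CSV.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,utilization.gpu",
    "--format=csv,noheader,nounits",
]


def parse_gpu_line(line):
    """Parse one 'memory_used_MiB, utilization_percent' CSV line into two ints."""
    mem_mib, util_pct = (int(field.strip()) for field in line.split(","))
    return mem_mib, util_pct


def poll_gpu(interval_s=5.0, samples=12):
    """Print memory use and utilization of the first GPU `samples` times."""
    for _ in range(samples):
        out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
        mem_mib, util_pct = parse_gpu_line(out.splitlines()[0])
        print(f"{time.strftime('%H:%M:%S')} mem={mem_mib / 1024:.1f} GiB util={util_pct}%")
        time.sleep(interval_s)


# Parsing a line of the shape nvidia-smi prints for the query above.
print(parse_gpu_line("20480, 10"))
```

Calling `poll_gpu()` in a second terminal during a transcription session gives a timeline that can be compared with the point at which results start to go missing.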
