
Compiling vLLM's CPU backend with a newer Python (build completes, but the SCNet public forward of port 8000 can't connect)

Installing Python 3.12

The system already ships with Python 3.10 and a vLLM install (the DCU build of vLLM), but Python 3.12 is needed here, so the whole environment has to be set up again.

Why Python 3.12? I didn't write the reason down at the time and have since forgotten it. Lesson learned: don't skip notes just because something seems obvious in the moment.

Installing pyenv with one command

curl https://pyenv.run | bash
# or
curl -fsSL https://pyenv.run | bash

Add the following three lines to .bashrc:

export PYENV_ROOT="$HOME/.pyenv"
[[ -d $PYENV_ROOT/bin ]] && export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init - bash)"

Installing Python 3.12.9 with pyenv

pyenv install 3.12.9

Note: set pyenv's download mirror first, otherwise the download is very slow:

export PYTHON_BUILD_MIRROR_URL="https://registry.npmmirror.com/-/binary/python"
export PYTHON_BUILD_MIRROR_URL_SKIP_CHECKSUM=1

Setting Python 3.12.9 as the active version

Global setting:

pyenv global 3.12.9

Per-directory setting

To pin a version for one directory, cd into it and run:

pyenv local 3.12.9

For example, running this inside ~/private_data makes Python 3.12 the version used throughout that directory (pyenv records the choice in a .python-version file there).
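To confirm which interpreter a global or local setting actually resolves to, a quick check from inside the directory (plain standard library, nothing pyenv-specific):

```python
import sys

# Print the interpreter path and version currently in effect.
# Inside a directory with `pyenv local 3.12.9`, this should report 3.12.9.
print(sys.executable)
print(f"{sys.version_info.major}.{sys.version_info.minor}.{sys.version_info.micro}")
```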

Downloading the vLLM source

git clone https://github.com/vllm-project/vllm.git vllm_source
cd vllm_source

Installing the required packages with pip

pip install --upgrade pip
pip install "cmake>=3.26" wheel packaging ninja "setuptools-scm>=8" numpy
pip install -v -r requirements/cpu.txt --extra-index-url https://download.pytorch.org/whl/cpu

Building

VLLM_TARGET_DEVICE=cpu python setup.py install

Build complete:

root@notebook-1982767703556689922-ac7sc1ejvp-81431:~/private_data/github/vllm_source# pip show vllm
Name: vllm
Version: 0.11.1rc6.dev53+g0ce743f4e.cpu
Summary: A high-throughput and memory-efficient inference and serving engine for LLMs
Home-page: https://github.com/vllm-project/vllm
Author: vLLM Team
Author-email:
License-Expression: Apache-2.0
Location: /root/.pyenv/versions/3.12.9/lib/python3.12/site-packages/vllm-0.11.1rc6.dev53+g0ce743f4e.cpu-py3.12-linux-x86_64.egg
Requires: aiohttp, anthropic, blake3, cachetools, cbor2, cloudpickle, compressed-tensors, datasets, depyf, diskcache, einops, fastapi, filelock, gguf, intel-openmp, intel_extension_for_pytorch, lark, llguidance, lm-format-enforcer, mistral_common, msgspec, ninja, numba, numpy, openai, openai-harmony, opencv-python-headless, outlines_core, packaging, partial-json-parser, pillow, prometheus-fastapi-instrumentator, prometheus_client, protobuf, psutil, py-cpuinfo, pybase64, pydantic, python-json-logger, pyyaml, pyzmq, regex, requests, scipy, sentencepiece, setproctitle, setuptools, six, tiktoken, tokenizers, torch, torchaudio, torchvision, tqdm, transformers, triton, typing_extensions, watchfiles, xgrammar
Required-by:

Downloading a model for a test run

VLLM_USE_MODELSCOPE=true vllm serve baidu/ERNIE-4.5-0.3B-PT --trust-remote-code

It fails with an error:

(APIServer pid=9856) 2025-11-03 19:54:05,648 - modelscope - WARNING - Repo baidu/ERNIE-4.5-0.3B-PT not exists on https://www.modelscope.cn, will try on alternative endpoint https://www.modelscope.ai.
(APIServer pid=9856) 2025-11-03 19:54:06,543 - modelscope - ERROR - Repo baidu/ERNIE-4.5-0.3B-PT not exists on either https://www.modelscope.cn or https://www.modelscope.ai

The full traceback is given below under "Error (APIServer pid=5173) requests.exceptions.HTTPError: <Response [404]>".

Switching to the Hugging Face source, the server at least starts now:

vllm serve baidu/ERNIE-4.5-0.3B-PT --trust-remote-code

Output:

(APIServer pid=9958) INFO 11-03 19:55:24 [serving_completion.py:68] Using default completion sampling params from model: {'temperature': 0.8, 'top_p': 0.8}
(APIServer pid=9958) INFO 11-03 19:55:24 [serving_chat.py:130] Using default chat sampling params from model: {'temperature': 0.8, 'top_p': 0.8}
(APIServer pid=9958) INFO 11-03 19:55:24 [api_server.py:2021] Starting vLLM API server 0 on http://0.0.0.0:8000
(APIServer pid=9958) INFO 11-03 19:55:24 [launcher.py:38] Available routes are:
(APIServer pid=9958) INFO 11-03 19:55:24 [launcher.py:46] Route: /openapi.json, Methods: HEAD, GET

Testing it through the SCNet forward to port 8000:

https://c-1982767703556689922.ksai.scnet.cn:58043/v1/models

The output is:

{"object":"list","data":[{"id":"baidu/ERNIE-4.5-0.3B-PT","object":"model","created":1762171140,"owned_by":"vllm","root":"baidu/ERNIE-4.5-0.3B-PT","parent":null,"max_model_len":131072,"permission":[{"id":"modelperm-03dd329bca95465ba70cc152602ef958","object":"model_permission","created":1762171140,"allow_create_engine":false,"allow_sampling":true,"allow_logprobs":true,"allow_search_indices":false,"allow_view":true,"allow_fine_tuning":false,"organization":"*","group":null,"is_blocking":false}]}]}

That looks fine, so why doesn't the forward work?

Trying a DeepSeek model instead: deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B

vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --trust-remote-code

Same situation: the server appears to start fine, but when configuring the port-8000 forward in SCNet it reports 404 Not Found. The forward does go through after a fashion, but the platform's check of the endpoint apparently fails.

One thing worth checking: vLLM's OpenAI-compatible server has no route at the root path /, so any forwarder or health check that probes / will see a 404 even when the server is healthy; /health or /v1/models are better probe targets.

It may also be a problem with the self-compiled vLLM, though the API responses above suggest the server itself works.
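To separate "broken server" from "forwarder probing the wrong path", checking a few known routes directly helps. A sketch (assumes the server is reachable at BASE_URL; /health and /v1/models are standard vLLM API server routes, while / is expected to return 404):

```python
import urllib.error
import urllib.request

BASE_URL = "http://127.0.0.1:8000"  # or the SCNet-forwarded URL

def probe(path: str) -> int:
    """Return the HTTP status code for GET BASE_URL + path."""
    try:
        with urllib.request.urlopen(f"{BASE_URL}{path}") as resp:
            return resp.status
    except urllib.error.HTTPError as e:
        return e.code

# With the server running, something like:
# for path in ("/", "/health", "/v1/models"):
#     print(path, probe(path))
# A healthy vLLM server would typically give: / → 404, /health → 200, /v1/models → 200
```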

To be continued.

Debugging

Error: (APIServer pid=5173) requests.exceptions.HTTPError: <Response [404]>

root@notebook-1982767703556689922-ac7sc1ejvp-81431:~/private_data# VLLM_USE_MODELSCOPE=true vllm serve baidu/ERNIE-4.5-0.3B-PT --trust-remote-code
[W1103 10:33:53.161544580 OperatorEntry.cpp:218] Warning: Warning only once for all operators,  other operators may also be overridden.
  Overriding a previously registered kernel for the same operator and the same dispatch key
  operator: aten::_addmm_activation(Tensor self, Tensor mat1, Tensor mat2, *, Scalar beta=1, Scalar alpha=1, bool use_gelu=False) -> Tensor
    registered at /pytorch/build/aten/src/ATen/RegisterSchema.cpp:6
  dispatch key: AutocastCPU
  previous kernel: registered at /pytorch/aten/src/ATen/autocast_mode.cpp:327
       new kernel: registered at /opt/workspace/ipex-cpu-dev/csrc/cpu/autocast/autocast_mode.cpp:112 (function operator())
INFO 11-03 10:34:01 [importing.py:44] Triton is installed but 0 active driver(s) found (expected 1). Disabling Triton to prevent runtime errors.
INFO 11-03 10:34:01 [importing.py:68] Triton not installed or not compatible; certain GPU-related functions will not be available.
(APIServer pid=5173) INFO 11-03 10:34:07 [api_server.py:1952] vLLM API server version 0.11.1rc6.dev53+g0ce743f4e
(APIServer pid=5173) INFO 11-03 10:34:07 [utils.py:253] non-default args: {'model_tag': 'baidu/ERNIE-4.5-0.3B-PT', 'model': 'baidu/ERNIE-4.5-0.3B-PT', 'trust_remote_code': True}
(APIServer pid=5173) 2025-11-03 10:34:07,571 - modelscope - WARNING - Repo baidu/ERNIE-4.5-0.3B-PT not exists on https://www.modelscope.cn, will try on alternative endpoint https://www.modelscope.ai.
(APIServer pid=5173) 2025-11-03 10:34:08,198 - modelscope - ERROR - Repo baidu/ERNIE-4.5-0.3B-PT not exists on either https://www.modelscope.cn or https://www.modelscope.ai
(APIServer pid=5173) Traceback (most recent call last):
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/bin/vllm", line 33, in <module>
(APIServer pid=5173)     sys.exit(load_entry_point('vllm==0.11.1rc6.dev53+g0ce743f4e.cpu', 'console_scripts', 'vllm')())
(APIServer pid=5173)              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/site-packages/vllm-0.11.1rc6.dev53+g0ce743f4e.cpu-py3.12-linux-x86_64.egg/vllm/entrypoints/cli/main.py", line 73, in main
(APIServer pid=5173)     args.dispatch_function(args)
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/site-packages/vllm-0.11.1rc6.dev53+g0ce743f4e.cpu-py3.12-linux-x86_64.egg/vllm/entrypoints/cli/serve.py", line 59, in cmd
(APIServer pid=5173)     uvloop.run(run_server(args))
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/site-packages/uvloop/__init__.py", line 96, in run
(APIServer pid=5173)     return __asyncio.run(
(APIServer pid=5173)            ^^^^^^^^^^^^^^
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/asyncio/runners.py", line 195, in run
(APIServer pid=5173)     return runner.run(main)
(APIServer pid=5173)            ^^^^^^^^^^^^^^^^
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=5173)     return self._loop.run_until_complete(task)
(APIServer pid=5173)            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=5173)   File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/site-packages/uvloop/__init__.py", line 48, in wrapper
(APIServer pid=5173)     return await main
(APIServer pid=5173)            ^^^^^^^^^^
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/site-packages/vllm-0.11.1rc6.dev53+g0ce743f4e.cpu-py3.12-linux-x86_64.egg/vllm/entrypoints/openai/api_server.py", line 1996, in run_server
(APIServer pid=5173)     await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/site-packages/vllm-0.11.1rc6.dev53+g0ce743f4e.cpu-py3.12-linux-x86_64.egg/vllm/entrypoints/openai/api_server.py", line 2012, in run_server_worker
(APIServer pid=5173)     async with build_async_engine_client(
(APIServer pid=5173)                ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=5173)     return await anext(self.gen)
(APIServer pid=5173)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/site-packages/vllm-0.11.1rc6.dev53+g0ce743f4e.cpu-py3.12-linux-x86_64.egg/vllm/entrypoints/openai/api_server.py", line 192, in build_async_engine_client
(APIServer pid=5173)     async with build_async_engine_client_from_engine_args(
(APIServer pid=5173)                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=5173)     return await anext(self.gen)
(APIServer pid=5173)            ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/site-packages/vllm-0.11.1rc6.dev53+g0ce743f4e.cpu-py3.12-linux-x86_64.egg/vllm/entrypoints/openai/api_server.py", line 218, in build_async_engine_client_from_engine_args
(APIServer pid=5173)     vllm_config = engine_args.create_engine_config(usage_context=usage_context)
(APIServer pid=5173)                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/site-packages/vllm-0.11.1rc6.dev53+g0ce743f4e.cpu-py3.12-linux-x86_64.egg/vllm/engine/arg_utils.py", line 1318, in create_engine_config
(APIServer pid=5173)     maybe_override_with_speculators(
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/site-packages/vllm-0.11.1rc6.dev53+g0ce743f4e.cpu-py3.12-linux-x86_64.egg/vllm/transformers_utils/config.py", line 528, in maybe_override_with_speculators
(APIServer pid=5173)     config_dict, _ = PretrainedConfig.get_config_dict(
(APIServer pid=5173)                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/site-packages/modelscope/utils/hf_util/patcher.py", line 185, in patch_get_config_dict
(APIServer pid=5173)     model_dir = get_model_dir(pretrained_model_name_or_path,
(APIServer pid=5173)                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/site-packages/modelscope/utils/hf_util/patcher.py", line 161, in get_model_dir
(APIServer pid=5173)     model_dir = snapshot_download(
(APIServer pid=5173)                 ^^^^^^^^^^^^^^^^^^
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/site-packages/modelscope/hub/snapshot_download.py", line 132, in snapshot_download
(APIServer pid=5173)     return _snapshot_download(
(APIServer pid=5173)            ^^^^^^^^^^^^^^^^^^^
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/site-packages/modelscope/hub/snapshot_download.py", line 300, in _snapshot_download
(APIServer pid=5173)     endpoint = _api.get_endpoint_for_read(
(APIServer pid=5173)                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/site-packages/modelscope/hub/api.py", line 527, in get_endpoint_for_read
(APIServer pid=5173)     self.repo_exists(
(APIServer pid=5173)   File "/root/.pyenv/versions/3.12.9/lib/python3.12/site-packages/modelscope/hub/api.py", line 685, in repo_exists
(APIServer pid=5173)     raise HTTPError(r)
(APIServer pid=5173) requests.exceptions.HTTPError: <Response [404]>

