
KTransformers Installation Notes

    • 0. Prerequisites
    • 1. Install dependencies
    • 2. Create and activate the virtual environment
    • 3. Install PyTorch 2.6
    • 4. Download the project source code
      • For a dual-CPU machine with 1 TB of RAM
    • 5. Download model weights and config files
    • 6. Install KTransformers with Docker

0. Prerequisites

CUDA 12.4
Upgrade to a recent version of CMake, and install git, g++, and gcc.

Run the following commands (or append them to ~/.bashrc) to point CUDA_HOME at your CUDA installation directory (e.g. /usr/local/cuda-12.4):

export CUDA_HOME=/usr/local/cuda-12.4
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH
source ~/.bashrc

Check that it worked:

which nvcc

If the output looks like the following, the configuration succeeded:

/usr/local/cuda-12.4/bin/nvcc

Create a local container:

docker run -it -d --gpus all --privileged \
  -p 8083:8080 -p 8084:8084 -p 8055:22 \
  --name KT_ubuntu2204_cuda124_cudnn8700 \
  -v /home/data_c/KT_data/:/home/data_c/KT_data/ \
  -v /home/data_a/gqr/:/home/data_a \
  afa4f07f5e5e /bin/bash

This assumes CUDA, cuDNN, and conda are already installed in the image.

1. Install dependencies

sudo apt-get update 
sudo apt-get install build-essential cmake ninja-build patchelf

2. Create and activate the virtual environment

conda create --name ktransformers python=3.11
conda activate ktransformers   # activate the environment
conda install -c conda-forge libstdcxx-ng   # conda-forge's libstdcxx-ng package ships a newer libstdc++
# check which GLIBCXX versions the env's libstdc++ exports
strings ~/anaconda3/envs/ktransformers/lib/libstdc++.so.6 | grep GLIBCXX

The output should include GLIBCXX_3.4.30; if it does not, see Error 2 below.

3. Install PyTorch 2.6


pip install torch torchvision torchaudio

Mirror commands for users in mainland China:

pip3 install torch torchvision torchaudio --default-timeout=100 -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install packaging ninja cpufeature numpy  --default-timeout=100 -i https://pypi.tuna.tsinghua.edu.cn/simple
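
A quick way to confirm that the PyTorch install can actually see the GPU (a minimal check; run it inside the ktransformers env):

# sanity check for the PyTorch install
import torch
print(torch.__version__)           # expect 2.6.x
print(torch.cuda.is_available())   # expect True
print(torch.version.cuda)          # expect a 12.x version matching CUDA_HOME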

Install flash-attention:

pip install flash_attn

Prefer installing it with the command above: if that succeeds, it also confirms that nvcc is set up correctly, which gives more confidence for the remaining build steps.

Alternatively, you can download the .whl matching your CUDA and torch versions directly (the official docs say this works; I have not tried it myself):

https://github.com/Dao-AILab/flash-attention/releases

Or:

pip install flash-attn --find-links https://github.com/Dao-AILab/flash-attention/releases

To check whether flash-attn installed correctly, run:

import flash_attn
print(flash_attn.__version__)
# Sometimes flash_attn installs successfully but its CUDA extensions fail to build; test the core CUDA extensions too:
from flash_attn.layers.rotary import RotaryEmbedding
from flash_attn.bert_padding import pad_input, unpad_input

If all of this runs without errors, the installation succeeded.

Then install the remaining system libraries and Python dependencies:

apt install libtbb-dev libssl-dev libcurl4-openssl-dev libaio1 libaio-dev libgflags-dev zlib1g-dev libfmt-dev
apt install libtbb-dev libssl-dev libcurl4-openssl-dev libaio1 libaio-dev libfmt-dev libgflags-dev zlib1g-dev patchelf
apt install libnuma-dev
pip install packaging ninja cpufeature numpy openai

4. Download the project source code

git clone https://github.com/kvcache-ai/ktransformers.git
cd ktransformers
git submodule update --init --recursive

The installation command differs by hardware setup.

For a dual-CPU machine with 1 TB of RAM:

# For Multi-concurrency with two cpu and 1T RAM:
apt install libnuma-dev
export USE_NUMA=1
USE_BALANCE_SERVE=1 USE_NUMA=1 bash ./install.sh
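
Once install.sh finishes, a one-line import makes a quick smoke test of the freshly built CPU-inference extension (cpuinfer_ext is the module the tracebacks below refer to; this assumes you run it from an environment where the build installed it):

# smoke test: import the extension built by install.sh
import cpuinfer_ext
print("cpuinfer_ext imported OK")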

After a long wait, the build succeeds.

5. Download model weights and config files


Download the DeepSeek-R1 Q4 model weights:

# install the ModelScope CLI
pip install modelscope
# download the model weight files
modelscope download --model lmstudio-community/DeepSeek-R1-GGUF  --include DeepSeek-R1-Q4_K_M*  --local_dir ./dir
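
The Q4_K_M weights arrive as several GGUF shards; a small sketch to confirm they all landed (the ./dir path comes from --local_dir above):

# list the downloaded GGUF shards
import glob
shards = sorted(glob.glob("./dir/**/*Q4_K_M*.gguf", recursive=True))
for shard in shards:
    print(shard)
print(f"{len(shards)} shard(s) found")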


Download the DeepSeek-R1 config files:

modelscope download --model deepseek-ai/DeepSeek-R1 --exclude *.safetensors  --local_dir ./config


Errors encountered:

Error 1:
The build stage fails with:

ERROR: Failed to build installable wheels for some pyproject.toml based projects (ktransformers)

Official solution: install the missing system libraries:

sudo apt install libtbb-dev libssl-dev libcurl4-openssl-dev libaio-dev libfmt-dev libgflags-dev zlib1g-dev patchelf
Additionally, adding the line set(CMAKE_CUDA_STANDARD 17) to ktransformers/csrc/balance_serve/CMakeLists.txt should let the build go through.

After that, the rebuild completes.

Error 2:
Problem description: running the following command

python ktransformers/server/main.py \
  --port 10002 \
  --model_path /home/data_c/KT_data/config \
  --gguf_path /home/data_c/KT_data/dir \
  --optimize_config_path ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-serve.yaml \
  --max_new_tokens 1024 \
  --cache_lens 32768 \
  --chunk_size 256 \
  --max_batch_size 4 \
  --backend_type balance_serve \
  --force_think   # useful for R1

fails with:

File "/kiwi/helly/ktransformers/ktransformers/ktransformers/local_chat.py", line 110, in local_chat
optimize_and_load_gguf(model, optimize_rule_path, gguf_path, config)
File "/kiwi/helly/ktransformers/ktransformers/ktransformers/optimize/optimize.py", line 128, in optimize_and_load_gguf
inject(module, optimize_config, model_config, gguf_loader)
File "/kiwi/helly/ktransformers/ktransformers/ktransformers/optimize/optimize.py", line 31, in inject
module_cls=getattr(import(import_module_name, fromlist=[""]), import_class_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/kiwi/helly/ktransformers/ktransformers/ktransformers/operators/models.py", line 22, in
from ktransformers.operators.dynamic_attention import DynamicScaledDotProductAttention
File "/kiwi/helly/ktransformers/ktransformers/ktransformers/operators/dynamic_attention.py", line 19, in
from ktransformers.operators.cpuinfer import CPUInfer, CPUInferKVCache
File "/kiwi/helly/ktransformers/ktransformers/ktransformers/operators/cpuinfer.py", line 25, in
import cpuinfer_ext
ImportError: /root/miniconda3/envs/ktransformers/bin/../lib/libstdc++.so.6: version `GLIBCXX_3.4.30' not found (required by /kiwi/helly/ktransformers/ktransformers/cpuinfer_ext.cpython-311-x86_64-linux-gnu.so)

Solution:
Run the first command below, then start the service:

# run this first
export LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libstdc++.so.6
# then start the service
python ktransformers/server/main.py \
  --port 10002 \
  --model_path /home/data_c/KT_data/config \
  --gguf_path /home/data_c/KT_data/dir \
  --optimize_config_path ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-serve.yaml \
  --max_new_tokens 1024 \
  --cache_lens 32768 \
  --chunk_size 256 \
  --max_batch_size 4 \
  --backend_type balance_serve \
  --force_think   # useful for R1
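
To confirm the LD_PRELOAD actually took effect, you can inspect which libstdc++ the Python process mapped (a minimal sketch, assuming Linux and the multiarch path used above):

# check which libstdc++.so.6 this process loaded
import ctypes
ctypes.CDLL("libstdc++.so.6")   # force the library into the process
with open("/proc/self/maps") as maps:
    paths = {line.split()[-1] for line in maps if "libstdc++" in line}
print(paths)   # expect the system copy, not the conda env's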

Other installation commands:

conda create --name ktransformers python=3.11
conda activate ktransformers
conda install -c conda-forge libstdcxx-ng
sudo apt install libtbb-dev libssl-dev libcurl4-openssl-dev libaio1 libaio-dev libfmt-dev libgflags-dev zlib1g-dev patchelf
git clone https://github.com/kvcache-ai/ktransformers.git
cd ktransformers
git submodule update --init --recursive
sudo env USE_BALANCE_SERVE=1 PYTHONPATH="$(which python)" PATH="$(dirname $(which python)):$PATH" bash ./install.sh

Error 3:
After the service starts, the following appears during the weight-loading stage (quoted from a GitHub issue):

Use docker and upgrade to 0.2.3 version
ktransformers 0.2.3.post1+cu121torch23fancy

ktransformers --gguf_path /models/GGUF --model_path /models --cpu_infer 190 \
  --optimize_config_path /models/DeepSeek-V3-Chat-multi-gpu.yaml --port 10002 \
  --max_new_tokens 8192 --host 0.0.0.0 --cache_lens 32768 --total_context 32768 \
  --force_think --no-use_cuda_graph --model_name DeepSeek-R1

Get warnings below but still can work. No idea what's wrong.

loading blk.3.attn_q_a_norm.weight to cuda:6
loading blk.3.attn_kv_a_norm.weight to cuda:6
loading blk.3.attn_kv_b.weight to cuda:6
mbind: Operation not permitted
mbind: Operation not permitted
mbind: Operation not permitted
mbind: Operation not permitted
mbind: Operation not permitted
mbind: Operation not permitted
set_mempolicy: Operation not permitted
set_mempolicy: Operation not permitted

Cause: when deploying with Docker, the container lacks permission for the NUMA memory-policy syscalls (Docker's default seccomp profile blocks mbind and set_mempolicy), so the container must be created with the extra flag

--privileged

That is:

docker run -it -d --gpus all --privileged \
  -p 8083:8080 -p 8084:8084 -p 8055:22 \
  --name KT_ubuntu2204_cuda124_cudnn8700 \
  -v /home/data_c/KT_data/:/home/data_c/KT_data/ \
  -v /home/data_a/gqr/:/home/data_a \
  afa4f07f5e5e /bin/bash

Then rebuild with the same installation commands as under Error 2 above; on a NUMA machine, also enable NUMA support:

apt install libnuma-dev
export USE_NUMA=1
env USE_BALANCE_SERVE=1 PYTHONPATH="$(which python)" PATH="$(dirname $(which python)):$PATH" bash ./install.sh

6. Install KTransformers with Docker

These notes install version 0.2.1.

Pull the official image:

docker pull approachingai/ktransformers:0.2.1

Create the container:

docker run -itd --gpus all --privileged \
  -p 8077:8082 \
  -v /home/data_c/KT_data/:/models \
  --name ktransformers \
  approachingai/ktransformers:0.2.1

Enter the container:

docker exec -it ktransformers /bin/bash

Start a chat session with the model:

python -m ktransformers.local_chat  --gguf_path /models/DeepSeek-V3 --model_path /models/deepseek-ai/DeepSeek-V3 --cpu_infer 33

If that command fails with:

Illegal instruction (core dumped)

The fix is to recompile KTransformers inside the container:

bash install.sh

Once it starts, the chat prompt appears.

The server can also be exposed as an OpenAI-compatible API.

Run:

python ktransformers/server/main.py \
  --port 8082 \
  --model_path /models/deepseek-ai/DeepSeek-V3 \
  --gguf_path /models/DeepSeek-V3 \
  --optimize_config_path ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat.yaml \
  --max_new_tokens 1024 \
  --cache_lens 32768 \
  --max_batch_size 4 \
  --force_think True \
  --model_name DeepSeek-R1


Query it:

curl -X POST http://localhost:8082/v1/chat/completions \
  -H "accept: application/json" \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "hello"}], "model": "DeepSeek-R1", "temperature": 0.3, "top_p": 1.0, "stream": true}'
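
Because the endpoint follows the OpenAI protocol, the openai Python package installed earlier also works as a client (a minimal sketch; the port and model name match the serve command above, and the api_key is a placeholder on the assumption that the local server does not validate it):

# stream a chat completion from the local KTransformers server
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8082/v1", api_key="placeholder")
stream = client.chat.completions.create(
    model="DeepSeek-R1",
    messages=[{"role": "user", "content": "hello"}],
    temperature=0.3,
    top_p=1.0,
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta
    if delta.content:
        print(delta.content, end="", flush=True)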

Full parameter list:

usage: kvcache.ai [-h] [--host HOST] [--port PORT] [--ssl_keyfile SSL_KEYFILE] [--ssl_certfile SSL_CERTFILE] [--web WEB]
                  [--model_name MODEL_NAME] [--model_dir MODEL_DIR] [--model_path MODEL_PATH] [--device DEVICE]
                  [--gguf_path GGUF_PATH] [--optimize_config_path OPTIMIZE_CONFIG_PATH] [--cpu_infer CPU_INFER] [--type TYPE]
                  [--paged PAGED] [--total_context TOTAL_CONTEXT] [--max_batch_size MAX_BATCH_SIZE]
                  [--max_chunk_size MAX_CHUNK_SIZE] [--max_new_tokens MAX_NEW_TOKENS] [--json_mode JSON_MODE]
                  [--healing HEALING] [--ban_strings BAN_STRINGS] [--gpu_split GPU_SPLIT] [--length LENGTH]
                  [--rope_scale ROPE_SCALE] [--rope_alpha ROPE_ALPHA] [--no_flash_attn NO_FLASH_ATTN] [--low_mem LOW_MEM]
                  [--experts_per_token EXPERTS_PER_TOKEN] [--load_q4 LOAD_Q4] [--fast_safetensors FAST_SAFETENSORS]
                  [--draft_model_dir DRAFT_MODEL_DIR] [--no_draft_scale NO_DRAFT_SCALE] [--modes MODES] [--mode MODE]
                  [--username USERNAME] [--botname BOTNAME] [--system_prompt SYSTEM_PROMPT] [--temperature TEMPERATURE]
                  [--smoothing_factor SMOOTHING_FACTOR] [--dynamic_temperature DYNAMIC_TEMPERATURE] [--top_k TOP_K]
                  [--top_p TOP_P] [--top_a TOP_A] [--skew SKEW] [--typical TYPICAL] [--repetition_penalty REPETITION_PENALTY]
                  [--frequency_penalty FREQUENCY_PENALTY] [--presence_penalty PRESENCE_PENALTY]
                  [--max_response_tokens MAX_RESPONSE_TOKENS] [--response_chunk RESPONSE_CHUNK]
                  [--no_code_formatting NO_CODE_FORMATTING] [--cache_8bit CACHE_8BIT] [--cache_q4 CACHE_Q4]
                  [--ngram_decoding NGRAM_DECODING] [--print_timings PRINT_TIMINGS] [--amnesia AMNESIA]
                  [--batch_size BATCH_SIZE] [--cache_lens CACHE_LENS] [--log_dir LOG_DIR] [--log_file LOG_FILE]
                  [--log_level LOG_LEVEL] [--backup_count BACKUP_COUNT] [--db_type DB_TYPE] [--db_host DB_HOST]
                  [--db_port DB_PORT] [--db_name DB_NAME] [--db_pool_size DB_POOL_SIZE] [--db_database DB_DATABASE]
                  [--user_secret_key USER_SECRET_KEY] [--user_algorithm USER_ALGORITHM] [--force_think FORCE_THINK]
                  [--web_cross_domain WEB_CROSS_DOMAIN] [--file_upload_dir FILE_UPLOAD_DIR]
                  [--assistant_store_dir ASSISTANT_STORE_DIR] [--prompt_file PROMPT_FILE]

Error:

Running the 0.2.1 API inside the Docker container raises the error below; one workaround is to pull the 0.2.0 image and build the project there instead.

During handling of the above exception, another exception occurred:

Exception Group Traceback (most recent call last):
| File "/usr/python310/lib/python3.10/site-packages/uvicorn/protocols/http/httptools_impl.py", line 409, in run_asgi
| result = await app( # type: ignore[func-returns-value]
| File "/usr/python310/lib/python3.10/site-packages/uvicorn/middleware/proxy_headers.py", line 60, in __call__
| return await self.app(scope, receive, send)
| File "/usr/python310/lib/python3.10/site-packages/fastapi/applications.py", line 1054, in __call__
| await super().__call__(scope, receive, send)
| File "/usr/python310/lib/python3.10/site-packages/starlette/applications.py", line 112, in __call__
| await self.middleware_stack(scope, receive, send)
| File "/usr/python310/lib/python3.10/site-packages/starlette/middleware/errors.py", line 187, in __call__
| raise exc
| File "/usr/python310/lib/python3.10/site-packages/starlette/middleware/errors.py", line 165, in __call__
| await self.app(scope, receive, _send)
| File "/usr/python310/lib/python3.10/site-packages/starlette/middleware/cors.py", line 85, in __call__
| await self.app(scope, receive, send)
| File "/usr/python310/lib/python3.10/site-packages/starlette/middleware/exceptions.py", line 62, in __call__
| await wrap_app_handling_exceptions(self.app, conn)(scope, receive, send)
| File "/usr/python310/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
| raise exc
| File "/usr/python310/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
| await app(scope, receive, sender)
| File "/usr/python310/lib/python3.10/site-packages/starlette/routing.py", line 715, in __call__
| await self.middleware_stack(scope, receive, send)
| File "/usr/python310/lib/python3.10/site-packages/starlette/routing.py", line 735, in app
| await route.handle(scope, receive, send)
| File "/usr/python310/lib/python3.10/site-packages/starlette/routing.py", line 288, in handle
| await self.app(scope, receive, send)
| File "/usr/python310/lib/python3.10/site-packages/starlette/routing.py", line 76, in app
| await wrap_app_handling_exceptions(app, request)(scope, receive, send)
| File "/usr/python310/lib/python3.10/site-packages/starlette/_exception_handler.py", line 53, in wrapped_app
| raise exc
| File "/usr/python310/lib/python3.10/site-packages/starlette/_exception_handler.py", line 42, in wrapped_app
| await app(scope, receive, sender)
| File "/usr/python310/lib/python3.10/site-packages/starlette/routing.py", line 74, in app
| await response(scope, receive, send)
| File "/usr/python310/lib/python3.10/site-packages/starlette/responses.py", line 261, in __call__
| async with anyio.create_task_group() as task_group:
| File "/usr/python310/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 767, in __aexit__
| raise BaseExceptionGroup(
| exceptiongroup.ExceptionGroup: unhandled errors in a TaskGroup (1 sub-exception)
+-+---------------- 1 ----------------
| Traceback (most recent call last):
| File "/usr/python310/lib/python3.10/site-packages/starlette/responses.py", line 264, in wrap
| await func()
| File "/usr/python310/lib/python3.10/site-packages/starlette/responses.py", line 245, in stream_response
| async for chunk in self.body_iterator:
| File "/mnt/data/ktransformers-0.2.1/ktransformers/server/schemas/assistants/streaming.py", line 80, in check_client_link
| async for event in async_events:
| File "/mnt/data/ktransformers-0.2.1/ktransformers/server/schemas/assistants/streaming.py", line 93, in to_stream_reply
| async for event in async_events:
| File "/mnt/data/ktransformers-0.2.1/ktransformers/server/schemas/assistants/streaming.py", line 87, in add_done
| async for event in async_events:
| File "/mnt/data/ktransformers-0.2.1/ktransformers/server/schemas/assistants/streaming.py", line 107, in filter_chat_chunk
| async for event in async_events:
| File "/mnt/data/ktransformers-0.2.1/ktransformers/server/api/openai/endpoints/chat.py", line 34, in inner
| async for token in interface.inference(input_message, id):
| File "/mnt/data/ktransformers-0.2.1/ktransformers/server/backend/interfaces/ktransformers.py", line 181, in inference
| async for v in super().inference(local_messages, thread_id):
| File "/mnt/data/ktransformers-0.2.1/ktransformers/server/backend/interfaces/transformers.py", line 340, in inference
| for t in self.prefill(input_ids, self.check_is_new(thread_id)):
| File "/usr/python310/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 36, in generator_context
| response = gen.send(None)
| File "/mnt/data/ktransformers-0.2.1/ktransformers/server/backend/interfaces/ktransformers.py", line 130, in prefill
| self.cache.reset()
| File "/mnt/data/ktransformers-0.2.1/ktransformers/models/custom_cache.py", line 175, in reset
| self.value_cache[layer_idx].zero_()
| AttributeError: 'NoneType' object has no attribute 'zero_'


The patched reset() in ktransformers/models/custom_cache.py:

    def reset(self):
        """Resets the cache values while preserving the objects"""
        for layer_idx in range(len(self.key_cache)):
            # In-place ops prevent breaking the static address
            if self.key_cache[layer_idx] is not None:
                self.key_cache[layer_idx].zero_()
            if self.value_cache[layer_idx] is not None:
                self.value_cache[layer_idx].zero_()
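
The in-place zero_() calls matter here: they clear the tensors without reallocating them, so anything holding the original storage address (such as a captured CUDA graph) stays valid, while the None checks skip cache slots that were never allocated.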

Alternatively, this should be fixed by the latest release, 0.2.1.post1.


