
Original: 🚣‍♂️ Running the llama2-7b model on a CPU (with AVX instructions) using PaddleNLP 🚣 — PaddleNLP documentation

Running the llama2-7b model on a CPU (with AVX instructions) using PaddleNLP 🚣

PaddleNLP has been deeply adapted and optimized for the llama model family on CPUs that support AVX instructions. This document describes how to use PaddleNLP for high-performance inference with llama-family models on such CPUs.

Check the hardware:

Chip type                          GCC version    cmake version
Intel(R) Xeon(R) Platinum 8463B    9.4.0          >=3.18

Note: to verify whether your machine supports AVX instructions, run the following command in a shell and check whether it prints anything:

lscpu | grep -o -P '(?<!\w)(avx\w*)'
# example output:
avx
avx2
avx512f
avx512dq
avx512ifma
avx512cd
avx512bw
avx512vl
avx_vnni
avx512_bf16
avx512vbmi
avx512_vbmi2
avx512_vnni
avx512_bitalg
avx512_vpopcntdq
avx512_fp16
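The flag check above can also be done programmatically. A minimal sketch, assuming a POSIX shell; the helper name `has_avx_flags` and the minimal flag set are my own choices, not from the PaddleNLP docs:

```shell
# Hypothetical helper: verify that a space-separated flag list (e.g. from
# the lscpu command above) contains the AVX features this workflow needs.
has_avx_flags() {
  flags=" $1 "
  for f in avx avx2 avx512f; do      # minimal set; extend as required
    case "$flags" in
      *" $f "*) ;;                   # flag present, keep checking
      *) echo "missing: $f"; return 1 ;;
    esac
  done
  return 0
}

# On a real machine you would call:
#   has_avx_flags "$(lscpu | grep -o -P '(?<!\w)(avx\w*)' | tr '\n' ' ')"
```

If any required flag is absent the helper names it and returns nonzero, which makes it easy to gate the rest of the setup on.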

Environment setup:

1 Install numactl

apt-get update
apt-get install numactl

2 Install Paddle

2.1 Install from source:
git clone https://github.com/PaddlePaddle/Paddle.git
cd Paddle && mkdir build && cd build
cmake .. -DPY_VERSION=3.8 -DWITH_GPU=OFF
make -j128
pip install -U python/dist/paddlepaddle-0.0.0-cp38-cp38-linux_x86_64.whl
2.2 Install with pip:
python -m pip install --pre paddlepaddle -i https://www.paddlepaddle.org.cn/packages/nightly/cpu/
2.3 Check that the installation works:
python -c "import paddle; paddle.version.show()"
python -c "import paddle; paddle.utils.run_check()"

3 Clone the PaddleNLP repository and install its dependencies

# PaddleNLP is a natural language processing and large language model (LLM) development library built on PaddlePaddle ("飞桨"). It contains many large models implemented on the PaddlePaddle framework, including the llama family. To get the most out of PaddleNLP, clone the whole repository.
pip install --pre --upgrade paddlenlp -f https://www.paddlepaddle.org.cn/whl/paddlenlp.html

4 Install third-party libraries and paddlenlp_ops

# The PaddleNLP repository ships dedicated fused operators so users can get the lowest possible inference cost
git clone https://github.com/PaddlePaddle/PaddleNLP.git
cd PaddleNLP/csrc/cpu
sh setup.sh

5 If the third-party libraries fail to build

# If oneccl fails to build, retry with gcc between 8.2 and 9.4
cd csrc/cpu/xFasterTransformer/3rdparty/
sh prepare_oneccl.sh
# If xFasterTransformer fails to build, retry with gcc 9.2 or newer
cd csrc/cpu/xFasterTransformer/build/
make -j24
# For more commands and environment variables, see csrc/cpu/setup.sh
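The two GCC constraints above (8.2-9.4 for oneccl, >= 9.2 for xFasterTransformer) only overlap at 9.2-9.4. A small sketch to check the installed compiler against that window; the `gcc_in_range` helper is illustrative and not part of setup.sh:

```shell
# Hypothetical check: is the installed gcc inside the 9.2-9.4 window that
# satisfies both the oneccl (8.2-9.4) and xFasterTransformer (>=9.2) builds?
gcc_in_range() {
  v="$1"
  major=${v%%.*}
  case "$v" in
    *.*) rest=${v#*.}; minor=${rest%%.*} ;;
    *)   minor=0 ;;                # e.g. "gcc -dumpversion" printing "13"
  esac
  [ "$major" -eq 9 ] && [ "$minor" -ge 2 ] && [ "$minor" -le 4 ]
}

if command -v gcc >/dev/null 2>&1; then
  if gcc_in_range "$(gcc -dumpversion)"; then
    echo "gcc OK for both builds"
  else
    echo "consider gcc 9.2-9.4 (found $(gcc -dumpversion))"
  fi
fi
```

Running this before `sh setup.sh` avoids a long build that is doomed to fail on a too-new compiler.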

CPU high-performance inference

PaddleNLP also provides high-performance CPU inference based on intel/xFasterTransformer. It currently supports FP16, BF16, and INT8 precision, as well as a mixed mode that runs Prefill in FP16 and Decode in INT8.

Reference for high-performance inference on non-HBM machines:

1 Determine OMP_NUM_THREADS
OMP_NUM_THREADS=$(lscpu | grep "Core(s) per socket" | awk -F ':' '{print $2}')
2 Dynamic-graph inference
cd ../../llm/
# Reference command for high-performance AVX dynamic-graph inference
OMP_NUM_THREADS=$(lscpu | grep "Core(s) per socket" | awk -F ':' '{print $2}') numactl -N 0  -m 0 python ./predict/predictor.py --model_name_or_path meta-llama/Llama-2-7b-chat --inference_model --dtype float32 --avx_mode --avx_type "fp16_int8" --device "cpu"
3 Static-graph inference
# step 1: export the static graph
python ./predict/export_model.py --model_name_or_path meta-llama/Llama-2-7b-chat --inference_model --output_path ./inference --dtype float32 --avx_mode --avx_type "fp16_int8" --device "cpu"
# step 2: static-graph inference
OMP_NUM_THREADS=$(lscpu | grep "Core(s) per socket" | awk -F ':' '{print $2}') numactl -N 0  -m 0 python ./predict/predictor.py --model_name_or_path ./inference --inference_model --dtype "float32" --mode "static" --device "cpu" --avx_mode
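The OMP_NUM_THREADS derivation above leaves the leading whitespace that awk extracts after the colon; it still works as an environment variable, but a slightly more robust sketch (the fallback value of 1 is my own choice, not from the docs) trims it and handles missing lscpu:

```shell
# Derive the per-socket core count once and export it; strip whitespace
# so the value is a clean integer.
cores=$( (lscpu 2>/dev/null || true) | awk -F ':' '/Core\(s\) per socket/ {gsub(/ /, "", $2); print $2}' )
OMP_NUM_THREADS=${cores:-1}    # fall back to 1 if lscpu is unavailable
export OMP_NUM_THREADS
echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
# numactl -N 0 -m 0 python ./predict/predictor.py ... (as above)
```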

Reference for high-performance inference on HBM machines:

1 Confirm the hardware and OMP_NUM_THREADS
# In theory an HBM machine gives a 1.3x-1.9x speedup in next-token latency over a non-HBM machine
# Confirm the machine has HBM
lscpu
# Here, node2 and node3 indicate HBM support
NUMA node0 CPU(s):                  0-31,64-95
NUMA node1 CPU(s):                  32-63,96-127
NUMA node2 CPU(s):
NUMA node3 CPU(s):
# Determine OMP_NUM_THREADS
lscpu | grep "Socket(s)" | awk -F ':' '{print $2}'
OMP_NUM_THREADS=$(lscpu | grep "Core(s) per socket" | awk -F ':' '{print $2}')
2 Dynamic-graph inference
cd ../../llm/
# Reference command for high-performance AVX dynamic-graph inference
FIRST_TOKEN_WEIGHT_LOCATION=0 NEXT_TOKEN_WEIGHT_LOCATION=2 OMP_NUM_THREADS=$(lscpu | grep "Core(s) per socket" | awk -F ':' '{print $2}') numactl -N 0  -m 0 python ./predict/predictor.py --model_name_or_path meta-llama/Llama-2-7b-chat --inference_model --dtype float32 --avx_mode --avx_type "fp16_int8" --device "cpu"
Note: FIRST_TOKEN_WEIGHT_LOCATION and NEXT_TOKEN_WEIGHT_LOCATION place the first_token weights on numa0 and the next_token weights on numa2 (the HBM memory node).
3 Static-graph inference
# Reference commands for high-performance static-graph inference
# step 1: export the static graph
python ./predict/export_model.py --model_name_or_path meta-llama/Llama-2-7b-chat --inference_model --output_path ./inference --dtype float32 --avx_mode --avx_type "fp16_int8" --device "cpu"
# step 2: static-graph inference
FIRST_TOKEN_WEIGHT_LOCATION=0 NEXT_TOKEN_WEIGHT_LOCATION=2 OMP_NUM_THREADS=$(lscpu | grep "Core(s) per socket" | awk -F ':' '{print $2}') numactl -N 0  -m 0 python ./predict/predictor.py --model_name_or_path ./inference --inference_model --dtype "float32" --mode "static" --device "cpu" --avx_mode
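The NEXT_TOKEN_WEIGHT_LOCATION pinning above assumes node 2 has no CPUs of its own (a memory-only, i.e. HBM, node). A sketch of how such nodes can be detected from lscpu output; the `hbm_nodes` helper is hypothetical, not part of PaddleNLP:

```shell
# Hypothetical helper: given lscpu output, print the NUMA node lines that
# list no CPUs -- these are candidates for memory-only (HBM) nodes.
hbm_nodes() {
  echo "$1" | grep 'NUMA node[0-9]* CPU(s):' | while IFS=: read -r name cpus; do
    if [ -z "$(echo "$cpus" | tr -d '[:space:]')" ]; then
      echo "$name"
    fi
  done
}

# Example with the output shown above (prints the node2 and node3 lines):
sample='NUMA node0 CPU(s):                  0-31,64-95
NUMA node1 CPU(s):                  32-63,96-127
NUMA node2 CPU(s):
NUMA node3 CPU(s):'
hbm_nodes "$sample"
```

On a real machine, pass `"$(lscpu)"` instead of the sample string and use the printed node numbers for NEXT_TOKEN_WEIGHT_LOCATION.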

Quick hands-on

Installation

Install system packages

sudo apt update
sudo apt install numactl

Check whether the CPU supports AVX

lscpu | grep -o -P '(?<!\w)(avx\w*)'

Install PaddlePaddle

pip install --pre paddlepaddle -i https://www.paddlepaddle.org.cn/packages/nightly/cpu/

Verify the PaddlePaddle installation

python -c "import paddle; paddle.version.show()"
python -c "import paddle; paddle.utils.run_check()"

Install the PaddleNLP library

pip install --pre --upgrade paddlenlp -f https://www.paddlepaddle.org.cn/whl/paddlenlp.html

Download the PaddleNLP source and install the acceleration operators

git clone https://github.com/PaddlePaddle/PaddleNLP.git
cd PaddleNLP/csrc/cpu
sh setup.sh

Build failure

Successfully installed intel-cmplr-lib-ur-2024.2.1 intel-openmp-2024.2.1 mkl-include-2024.0.0 mkl-static-2024.0.0 tbb-2021.13.1
CMake Error at CMakeLists.txt:129 (find_package):
  By not providing "FindoneCCL.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "oneCCL", but
  CMake did not find one.  Could not find a package configuration file
  provided by "oneCCL" with any of the following names:

    oneCCLConfig.cmake
    oneccl-config.cmake

  Add the installation prefix of "oneCCL" to CMAKE_PREFIX_PATH or set
  "oneCCL_DIR" to a directory containing one of the above files.  If "oneCCL"
  provides a separate development package or SDK, be sure it has been
  installed.
-- Configuring incomplete, errors occurred!
make: *** No targets specified and no makefile found.  Stop.

Go into the oneccl subdirectory and try rebuilding there

(py312) skywalk@DESKTOP-9C5AU01:~/github/PaddleNLP/csrc/cpu$ cd xFasterTransformer/3rdparty/
(py312) skywalk@DESKTOP-9C5AU01:~/github/PaddleNLP/csrc/cpu/xFasterTransformer/3rdparty$ sh prepare_oneccl.sh

Still failing. The docs say a gcc version between 8 and 9 works best, but mine is 13.3, which is on the high side, so I'm shelving this for now.

Current status: the build fails on my own machine; on the Xinghe community (AI Studio) the GitHub connection is too slow and the build fails; the Kaggle build fails too.

Installing the acceleration operators, take two

First, add Intel's CPU package repository on Ubuntu

# Download Intel's repository signing key
wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
sudo apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
echo "deb https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
sudo apt update
# Install the full development suite (includes oneCCL)
sudo apt install intel-oneapi-ccl intel-oneapi-ccl-devel intel-oneapi-runtime-dnnl

Then install again

cd PaddleNLP/csrc/cpu && oneCCL_DIR=/opt/intel/oneapi/ccl/latest/lib/cmake/oneCCL sh setup.sh

Inference

Go to the PaddleNLP/llm directory and run:

python ./predict/predictor.py --model_name_or_path deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --inference_model --dtype float32 --avx_mode --avx_type "fp16_int8" --device "cpu"

Summary

More pitfalls than expected; it's still not running end to end.

Debugging

Error: This system does not support NUMA policy

OMP_NUM_THREADS=$(lscpu | grep "Core(s) per socket" | awk -F ':' '{print $2}') numactl -N 0  -m 0 python ./predict/predictor.py --model_name_or_path meta-llama/Llama-2-7b-chat --inference_model --dtype float32 --avx_mode --avx_type "fp16_int8" --device "cpu"
numactl: This system does not support NUMA policy

So let's drop numactl, then.
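In environments without NUMA support (some VMs and WSL, for example) the numactl prefix can simply be dropped. A sketch that falls back automatically; the fallback logic is mine, not from the PaddleNLP docs:

```shell
# Use numactl binding when the system supports NUMA policy, else run plainly.
if command -v numactl >/dev/null 2>&1 && numactl --hardware >/dev/null 2>&1; then
  LAUNCHER="numactl -N 0 -m 0"
else
  LAUNCHER=""    # "This system does not support NUMA policy": run directly
fi
echo "launcher: ${LAUNCHER:-<none>}"
# $LAUNCHER python ./predict/predictor.py --model_name_or_path meta-llama/Llama-2-7b-chat \
#     --inference_model --dtype float32 --avx_mode --avx_type "fp16_int8" --device "cpu"
```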

Error: ModuleNotFoundError: No module named 'paddlenlp_ops'

from paddlenlp_ops import (
ModuleNotFoundError: No module named 'paddlenlp_ops'

It seems there is no getting around building paddlenlp_ops!
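A quick import check makes this failure visible right after the build instead of at inference time. A minimal sketch; the wording of the messages is mine:

```shell
# The fused-ops module is produced by csrc/cpu/setup.sh; if the import
# fails, that build did not complete.
py=$(command -v python || command -v python3 || true)
if [ -n "$py" ] && "$py" -c "import paddlenlp_ops" 2>/dev/null; then
  status="paddlenlp_ops is importable"
else
  status="paddlenlp_ops missing: rerun sh setup.sh in PaddleNLP/csrc/cpu"
fi
echo "$status"
```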

Building paddlenlp_ops on Kaggle fails

cd xFasterTransformer/3rdparty/

!cd PaddleNLP/csrc/cpu/xFasterTransformer/3rdparty && sh prepare_oneccl.sh

One last attempt, otherwise I'm giving up. Building oneccl on its own succeeded, but building paddlenlp still fails

-- MKL directory already exists. Skipping installation.
CMake Error at CMakeLists.txt:129 (find_package):
  By not providing "FindoneCCL.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "oneCCL", but
  CMake did not find one.  Could not find a package configuration file
  provided by "oneCCL" with any of the following names:

    oneCCLConfig.cmake
    oneccl-config.cmake

  Add the installation prefix of "oneCCL" to CMAKE_PREFIX_PATH or set
  "oneCCL_DIR" to a directory containing one of the above files.  If "oneCCL"
  provides a separate development package or SDK, be sure it has been
  installed.
-- Configuring incomplete, errors occurred!
make: *** No targets specified and no makefile found.  Stop.

On Kaggle I don't know what else to try... giving up

Local build error

-- MKL directory already exists. Skipping installation.
CMake Error at CMakeLists.txt:129 (find_package):
  By not providing "FindoneCCL.cmake" in CMAKE_MODULE_PATH this project has
  asked CMake to find a package configuration file provided by "oneCCL", but
  CMake did not find one.

  Could not find a package configuration file provided by "oneCCL" with any
  of the following names:

    oneCCLConfig.cmake
    oneccl-config.cmake

  Add the installation prefix of "oneCCL" to CMAKE_PREFIX_PATH or set
  "oneCCL_DIR" to a directory containing one of the above files.  If "oneCCL"
  provides a separate development package or SDK, be sure it has been
  installed.


-- Configuring incomplete, errors occurred!

Try installing directly with pip

pip install oneccl

Same error as before

Try installing this

sudo apt install libdnnl3

Trying a new approach

# Download Intel's repository signing key
wget https://apt.repos.intel.com/intel-gpg-keys/GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
sudo apt-key add GPG-PUB-KEY-INTEL-SW-PRODUCTS.PUB
echo "deb https://apt.repos.intel.com/oneapi all main" | sudo tee /etc/apt/sources.list.d/oneAPI.list
sudo apt update
# Install the full development suite (includes oneCCL)
sudo apt install intel-oneapi-ccl intel-oneapi-ccl-devel

Very slow on my local machine, and Kaggle isn't fast either

12% [4 intel-oneapi-mpi-2021.14 7797 kB/45.6 MB 17%]                                             23.0 kB/s 1h 14min 51s

Kaggle has finished installing, so the ops can now be built

!cd PaddleNLP/csrc/cpu && oneCCL_DIR=/opt/intel/oneapi/ccl/latest/ sh setup.sh

During the build there were errors like this

warnings.warn(warning_message)
/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
!!

        ********************************************************************************
        Please avoid running ``setup.py`` directly.
        Instead, use pypa/build, pypa/installer or other
        standards-based tools.

        See "Why you shouldn't invoke setup.py directly" for details.
        ********************************************************************************

!!
  self.initialize_options()
/usr/local/lib/python3.10/dist-packages/setuptools/_distutils/cmd.py:66: EasyInstallDeprecationWarning: easy_install command is deprecated.
!!

Kaggle ultimately fails with: /usr/bin/ld: cannot find -l:libxfastertransformer.so: No such file or directory

/usr/bin/ld: cannot find /kaggle/working/PaddleNLP/csrc/cpu/build/paddlenlp_ops/lib.linux-x86_64-cpython-310/avx_weight_only.o: No such file or directory
/usr/bin/ld: cannot find /kaggle/working/PaddleNLP/csrc/cpu/build/paddlenlp_ops/lib.linux-x86_64-cpython-310/stop_generation_multi_ends.o: No such file or directory
/usr/bin/ld: cannot find -l:libxfastertransformer.so: No such file or directory
/usr/bin/ld: cannot find -l:libxft_comm_helper.so: No such file or directory
collect2: error: ld returned 1 exit status
error: command '/usr/bin/x86_64-linux-gnu-g++' failed with exit code 1

Found where it comes from:

-- Using src='https://github.com/google/sentencepiece/releases/download/v0.1.99/sentencepiece-0.1.99.tar.gz'
/kaggle/working/PaddleNLP/csrc/cpu/xFasterTransformer/src/comm_helper/comm_helper.cpp:17:10: fatal error: oneapi/ccl.hpp: No such file or directory
   17 | #include "oneapi/ccl.hpp"
      |          ^~~~~~~~~~~~~~~~

So it's oneAPI that's missing.

Found the cause: the path set earlier was wrong.

# Standard Intel oneAPI path (Linux)
export oneCCL_DIR=/opt/intel/oneapi/ccl/latest/lib/cmake/ccl
# Custom installation path
export oneCCL_DIR=/your/custom/path/lib/cmake/ccl
# Pass the variable to CMake
cmake -DoneCCL_DIR=$oneCCL_DIR ..

This is the command to use:

!cd PaddleNLP/csrc/cpu && oneCCL_DIR=/opt/intel/oneapi/ccl/latest/lib/cmake/oneCCL sh setup.sh
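Rather than guessing the exact subdirectory, one can search for the config file CMake actually wants and derive oneCCL_DIR from it. A sketch, assuming oneAPI lives under /opt/intel/oneapi:

```shell
# Locate the CMake package config and point oneCCL_DIR at its directory.
CFG=$(find /opt/intel/oneapi -name 'oneCCLConfig.cmake' 2>/dev/null | head -n 1)
if [ -n "$CFG" ]; then
  oneCCL_DIR=$(dirname "$CFG")
  export oneCCL_DIR
  echo "oneCCL_DIR=$oneCCL_DIR"
else
  echo "oneCCLConfig.cmake not found under /opt/intel/oneapi"
fi
```

With oneCCL_DIR exported this way, `sh setup.sh` can be run without hard-coding the path.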

Still failing; this also needs installing:

sudo apt install intel-oneapi-runtime-dnnl

Kaggle error: Your notebook tried to allocate more memory than is available. It has restarted. (giving up)

Nothing to be done about this one; it simply exceeded the memory limit

Giving up on Kaggle

Local build error: status_string: "Failure when receiving data from the peer"

-- Using src='https://github.com/oneapi-src/oneDNN/releases/download/v0.21/mklml_lnx_2019.0.5.20190502.tgz'
Cloning into 'oneccl'...
CMake Error at /home/skywalk/github/PaddleNLP/csrc/cpu/xFasterTransformer/build/xdnn_lib-prefix/src/xdnn_lib-stamp/download-xdnn_lib.cmake:170 (message):
  Each download failed!

    error: downloading 'https://github.com/intel/xFasterTransformer/releases/download/IntrinsicGemm/xdnn_v1.5.2.tar.gz' failed
          status_code: 56
          status_string: "Failure when receiving data from the peer"

CMake Error at /home/skywalk/github/PaddleNLP/csrc/cpu/xFasterTransformer/build/examples/cpp/cmdline-prefix/src/cmdline-stamp/download-cmdline.cmake:170 (message):
  Each download failed!

    error: downloading 'https://github.com/tanakh/cmdline/archive/refs/heads/master.zip' failed
          status_code: 56
          status_string: "Failure when receiving data from the peer"
Probably just GitHub acting up.

Shelving this for now

Some other libraries that may be needed:

sudo apt install libdnnl-dev
sudo apt install intel-oneapi-mkl
sudo apt install libmkl-vml-avx libmkl-dev intel-oneapi-runtime-mkl

While installing intel-mkl (the math library), the following prompt appeared:
┌─────────────────────────────────────┤ Intel Math Kernel Library (Intel MKL) ├─────────────────────────────────────┐
 │                                                                                                                   │
 │ Intel MKL's Single Dynamic Library (SDL) is installed on your machine. This shared object can be used as an       │
 │ alternative to both libblas.so.3 and liblapack.so.3, so that packages built against BLAS/LAPACK can directly use  │
 │ MKL without rebuild.                                                                                              │
 │                                                                                                                   │
 │ However, MKL is non-free software, and in particular its source code is not publicly available. By using MKL as   │
 │ the default BLAS/LAPACK implementation, you might be violating the licensing terms of copyleft software that      │
 │ would become dynamically linked against it. Please verify that the licensing terms of the program(s) that you     │
 │ intend to use with MKL are compatible with the MKL licensing terms. For the case of software under the GNU        │
 │ General Public License, you may want to read this FAQ:                                                            │
 │                                                                                                                   │
 │     https://www.gnu.org/licenses/gpl-faq.html#GPLIncompatibleLibs                                                 │
 │                                                                                                                   │
 │                                                                                                                   │
 │ If you don't know what MKL is, or unwilling to set it as default, just choose the preset value or simply type     │
 │ Enter.                                                                                                            │
 │                                                                                                                   │
 │ Use libmkl_rt.so as the default alternative to BLAS/LAPACK?                                                       │
 │                                                                                                                   │
 │                                  <Yes>                                     <No>                                   │
 │                                                                                                                   │
So this library requires a separate license?

