Manually Installing Ollama 0.6.7 on SCNet's DCU Heterogeneous AI Platform
Download the source
The source lives here: OpenDAS / ollama · GitLab
git clone -b 0.6.7 http://developer.sourcefind.cn/codes/OpenDAS/ollama.git --depth=1
cd ollama
Install Go
wget https://golang.google.cn/dl/go1.24.1.linux-amd64.tar.gz
tar -C /usr/local -xzf go1.24.1.linux-amd64.tar.gz
export PATH=$PATH:/usr/local/go/bin
# Switch the Go module proxy to speed up downloads (optional)
go env -w GOPROXY=https://goproxy.cn,direct
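The `export PATH=...` above only lasts for the current shell session. To make it stick across logins, it can be appended to the shell's startup file (a sketch, assuming bash and `~/.bashrc`):

```shell
#!/bin/sh
# Append the Go PATH entry to ~/.bashrc once, skipping it if already present
grep -qs '/usr/local/go/bin' "$HOME/.bashrc" || \
  echo 'export PATH=$PATH:/usr/local/go/bin' >> "$HOME/.bashrc"
```

After this, new shells pick up the `go` binary automatically.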
Build
cmake -B build
cmake --build build --parallel 16
Run
export HSA_OVERRIDE_GFX_VERSION=<value for your card>  # Z100L (gfx906) → 9.0.6; K100 (gfx926) → 9.2.6; K100AI (gfx928) → 9.2.8
export ROCR_VISIBLE_DEVICES=0,1,2,...  # all device indices, or a subset to select specific devices
go run . serve  # the available devices are listed in this command's output
# Optionally enable flash attention and KV-cache quantization
OLLAMA_FLASH_ATTENTION=1 OLLAMA_KV_CACHE_TYPE=q4_0 go run . serve
go run . run llama3.1
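The exports above can be wrapped in a small helper; the card-to-version mapping below just restates the table from the comment (a sketch, not an official tool):

```shell
#!/bin/sh
# Map a DCU card model to its HSA_OVERRIDE_GFX_VERSION value
gfx_version() {
  case "$1" in
    Z100L)  echo 9.0.6 ;;  # gfx906
    K100)   echo 9.2.6 ;;  # gfx926
    K100AI) echo 9.2.8 ;;  # gfx928
    *)      echo "unknown card: $1" >&2; return 1 ;;
  esac
}

export HSA_OVERRIDE_GFX_VERSION="$(gfx_version K100AI)"
export ROCR_VISIBLE_DEVICES=0,1,2,3   # or a subset of device indices
echo "HSA_OVERRIDE_GFX_VERSION=$HSA_OVERRIDE_GFX_VERSION"
```

With the two variables exported, `go run . serve` starts against the selected devices.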
For example, the environment built from SCNet's official Ollama image uses a K100AI card; since it is the official image, all of these environment variables are already set, which saves you the trouble of writing them yourself.
Hands-on: installing version 0.6.3
Files · 0.6.3 · OpenDAS / ollama · GitLab
Download the source
git clone -b 0.6.3 http://developer.sourcefind.cn/codes/OpenDAS/ollama.git --depth=1
cd ollama
Build
export LIBRARY_PATH=/opt/dtk/lib:$LIBRARY_PATH
cmake -B build
cmake --build build
Run
export HSA_OVERRIDE_GFX_VERSION=9.2.8  # Z100L (gfx906) → 9.0.6; K100 (gfx926) → 9.2.6; K100AI (gfx928) → 9.2.8
export ROCR_VISIBLE_DEVICES=0  # all device indices (0,1,2,...) or a subset
go run . serve  # the available devices are listed in this command's output
# Optionally enable flash attention and KV-cache quantization
OLLAMA_FLASH_ATTENTION=1 OLLAMA_KV_CACHE_TYPE=q4_0 go run . serve
go run . run llama3.1
What I ended up running:
go run . run qwen3:0.6b
It runs, and fast! Output:
>>> hello
time=2025-10-27T14:23:44.722Z level=WARN source=ggml.go:152 msg="key not found" key=general.alignment default=32
<think>
Okay, the user just said "hello". I need to respond appropriately. Let me start by acknowledging their greeting. A simple "Hello!" is good, but
maybe add something more friendly. Since they might be testing, they could be testing my response. I should keep it open-ended to encourage
further conversation. Let me make sure the response is friendly and welcoming.
</think>Hello! How can I assist you today? 😊
[GIN] 2025/10/27 - 14:23:46 | 200 | 1.480535414s | 127.0.0.1 | POST "/api/chat"
>>> 你好
time=2025-10-27T14:23:50.035Z level=WARN source=ggml.go:152 msg="key not found" key=general.alignment default=32
<think>
好的,用户现在说“你好”,我需要回应。首先,用户之前已经打招呼过“hello”,现在换成了“你好”。我应该保持友好和热情的态度。可能用户是在测试我的反应,或
者想继续对话。我需要确认用户是否需要帮助,比如询问问题、提供信息,或者进行互动。同时,保持简洁明了,避免冗长。最后,用积极的语气结束,比如“很高兴
见到你!”来维持良好的互动氛围。
</think>很高兴见到你!有什么可以帮到你的吗? 😊
[GIN] 2025/10/27 - 14:23:51 | 200 | 2.011092903s | 127.0.0.1 | POST "/api/chat"
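Besides the interactive CLI, the running server also answers HTTP requests. A sketch against Ollama's standard `/api/generate` endpoint (the `curl` call itself is commented out because it needs `go run . serve` listening on the default port 11434):

```shell
#!/bin/sh
# JSON payload for a one-shot, non-streaming generation request
PAYLOAD='{"model":"qwen3:0.6b","prompt":"hello","stream":false}'
echo "$PAYLOAD"
# With the server running, send it like this:
# curl -s http://127.0.0.1:11434/api/generate -d "$PAYLOAD"
```

The response is a single JSON object whose `response` field holds the generated text.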
Trying the ERNIE (Wenxin) model (fails)
go run . run dengcao/ERNIE-4.5-0.3B-PT
It fails: the model architecture is not supported.
Running inference
export HSA_OVERRIDE_GFX_VERSION=9.2.8  # Z100L (gfx906) → 9.0.6; K100 (gfx926) → 9.2.6; K100AI (gfx928) → 9.2.8
go run . serve
go run . run deepseek-r1:671b
I'll try this one again when enough cards are available.
Summary
SCNet's official Ollama image can be used directly; it is simple and convenient.
The dcu25.04 environment can run version 0.6.7, which supports qwen3 but not the ERNIE (Wenxin) models.
Debugging
The build fails with:
clang-15: warning: argument unused during compilation: '--gpu-max-threads-per-block=1024' [-Wunused-command-line-argument]
[ 91%] Built target ggml-hip
make: *** [Makefile:136: all] Error 2
Dropping to 6 parallel jobs:
cmake --build build --parallel 6
The error then becomes:
cmake --build build --parallel 6
[ 1%] Built target ggml-cpu-haswell-feats
[ 2%] Built target ggml-cpu-icelake-feats
[ 2%] Built target ggml-cpu-skylakex-feats
[ 2%] Built target ggml-cpu-alderlake-feats
[ 2%] Built target ggml-cpu-sandybridge-feats
[ 6%] Built target ggml-base
[ 7%] Building C object ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-alderlake.dir/ggml-cpu/ggml-cpu.c.o
cc: error: unrecognized command line option ‘-mavxvnni’; did you mean ‘-mavx512vnni’?
make[2]: *** [ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-alderlake.dir/build.make:76: ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-alderlake.dir/ggml-cpu/ggml-cpu.c.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:456: ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-alderlake.dir/all] Error 2
make[1]: *** Waiting for unfinished jobs....
[ 58%] Built target ggml-hip
[ 67%] Built target ggml-cpu-skylakex
[ 75%] Built target ggml-cpu-icelake
[ 83%] Built target ggml-cpu-haswell
[ 91%] Built target ggml-cpu-sandybridge
Ugh. Then I found the build notes: installing from source is only documented for versions <= 0.3.5.
So anything newer than 0.3.5 isn't covered!
Version 0.5.7 does have a matching dtk24.03.
Let's try 0.6.3 instead.
0.6.3 has the same problem:
cc: error: unrecognized command line option ‘-mavxvnni’; did you mean ‘-mavx512vnni’?
make[2]: *** [ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-alderlake.dir/build.make:76: ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-alderlake.dir/ggml-cpu/ggml-cpu.c.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:456: ml/backend/ggml/ggml/src/CMakeFiles/ggml-cpu-alderlake.dir/all] Error 2
make: *** [Makefile:136: all] Error 2
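`-mavxvnni` was added in GCC 11, so this error usually means the system compiler is older and simply doesn't know the flag. A quick probe (assuming `cc` is on PATH):

```shell
#!/bin/sh
# Feed a trivial program to the compiler with -mavxvnni and see if it is accepted
if echo 'int main(void){return 0;}' | cc -mavxvnni -x c -c - -o /dev/null 2>/dev/null; then
  AVXVNNI=supported
else
  AVXVNNI=unsupported
fi
echo "avxvnni: $AVXVNNI"
```

If it prints `unsupported`, a newer toolchain (or an environment like dcu25.04 that ships one) is needed for these CPU-variant targets to build.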
After some searching, one suggestion was to try:
CUSTOM_CPU_FLAGS="avx,avx2,avx512" cmake --build build --parallel 6
Still no luck. Switching to the dcu25.04 environment, version 0.6.7 installed successfully!
Version 0.6.7: debugging the ERNIE model error
go run . run dengcao/ERNIE-4.5-0.3B-PT
Error: unable to load model: /root/.ollama/models/blobs/sha256-72511d0ebf100f82b036a1a868cd3a2b5a1c0c99a51ed4cedc5e726313def1ca
exit status 1
unknown model architecture: 'ernie4_5-moe'
So it's still unsupported!
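One way to check support before pulling a model is to search the checkout for the architecture string reported in the error (a sketch; the `llama/` path is an assumption about where the 0.6.x tree vendors its llama.cpp code):

```shell
#!/bin/sh
# Run from the ollama checkout: look for the architecture name from the error message
ARCH=ernie4_5
if grep -rq "$ARCH" llama/ 2>/dev/null; then
  RESULT="$ARCH: found in source tree"
else
  RESULT="$ARCH: not found (expect 'unknown model architecture')"
fi
echo "$RESULT"
```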
