Deploying DeepSeek on Kubernetes (CPU-based)
Base resources
A working Kubernetes cluster
Oracle Linux 9.4 as the operating system
I. Setting up the Ollama environment
1. Configure NFS storage
yum -y install nfs-utils
Edit the exports configuration file:
[root@k8s-master01 openui]# cat /etc/exports
/data/ollama *(rw,sync,no_root_squash)
/data/openui *(rw,sync,no_root_squash)
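After editing /etc/exports, the export table has to be loaded before clients can mount anything. A typical sequence on EL9 (assuming the standard `nfs-server` systemd unit) looks like this:

```shell
# create the export directories if they don't exist yet
mkdir -p /data/ollama /data/openui
# enable and start the NFS server, then (re)load the export table
systemctl enable --now nfs-server
exportfs -ra
# verify what is actually being exported
exportfs -v
```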
2. Deploy Ollama
The manifests below use a dedicated `ollama` namespace; create it first if it does not exist (`kubectl create namespace ollama`). The ollama.yml manifest:
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: ollama
  namespace: ollama
spec:
  serviceName: "ollama"
  replicas: 1
  selector:
    matchLabels:
      app: ollama
  template:
    metadata:
      labels:
        app: ollama
    spec:
      nodeSelector:
        ollama: "study"
      containers:
      - name: ollama
        image: ollama/ollama:latest
        imagePullPolicy: IfNotPresent
        ports:
        - containerPort: 11434
        volumeMounts:
        - name: ollama-volume
          # note: Ollama writes models to /root/.ollama by default;
          # set the OLLAMA_MODELS env var to this path if the models
          # should actually land on the persistent volume
          mountPath: /data/ollama/models
      volumes:
      - name: ollama-volume
        persistentVolumeClaim:
          claimName: ollama-pvc
---
apiVersion: v1
kind: Service
metadata:
  name: ollama
  namespace: ollama
  labels:
    app: ollama
spec:
  type: NodePort   # the service listing later shows this exposed as NodePort 30552
  ports:
  - port: 11434
    protocol: TCP
    targetPort: 11434
    nodePort: 30552
  selector:
    app: ollama
```
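Note that the StatefulSet claims `ollama-pvc`, which is never defined in this walkthrough. A plausible static PV/PVC pair against the `/data/ollama` NFS export (the PV name, capacity, and storage class are assumptions mirroring the Open WebUI storage defined later) would be:

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: ollama-pv   # assumed name, not from the original walkthrough
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: nfs-slow-0
  nfs:
    path: /data/ollama
    server: 192.168.17.200
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ollama-pvc
  namespace: ollama
spec:
  storageClassName: nfs-slow-0
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
```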
Label the node so the pod is scheduled there:
kubectl label nodes k8s-master01 ollama=study
Apply the manifest:
kubectl apply -f ollama.yml
Next, download the model inside the pod:
[root@k8s-master01 openui]# kubectl get po -A
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system calico-kube-controllers-d4bfdcb9-vb5qn 1/1 Running 2 (4h16m ago) 5h29m
kube-system calico-node-j7t7h 1/1 Running 2 (4h16m ago) 5h29m
kube-system coredns-6d8c4cb4d-th687 1/1 Running 3 (4h15m ago) 25h
kube-system coredns-6d8c4cb4d-wmrvs 1/1 Running 3 (4h16m ago) 25h
kube-system etcd-k8s-master01 1/1 Running 9 (4h16m ago) 25h
kube-system kube-apiserver-k8s-master01 1/1 Running 8 (4h16m ago) 25h
kube-system kube-controller-manager-k8s-master01 1/1 Running 12 (3h14m ago) 25h
kube-system kube-proxy-75z89 1/1 Running 9 (4h16m ago) 25h
kube-system kube-scheduler-k8s-master01 1/1 Running 10 (3h19m ago) 25h
ollama ollama-0 1/1 Running 0 39m
ollama webui-56668c9775-cdc2z 1/1 Running 0 28m
[root@k8s-master01 openui]# kubectl exec -it ollama-0 -n ollama -- bash
root@ollama-0:/# ollama run deepseek-r1:1.5b
# The model is downloaded automatically on first run. If the download fails,
# Open WebUI will not be able to find the model later.
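To confirm the model actually landed, you can list the models Ollama knows about through its HTTP API (using the NodePort shown in the service listing later):

```shell
# deepseek-r1:1.5b should appear in the returned model list
curl http://192.168.17.200:30552/api/tags
```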
II. Setting up the Open WebUI frontend
1. Configure storage first
```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: openwebui-pv
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Delete
  storageClassName: nfs-slow-0
  nfs:
    path: /data/openui   # must match the path exported in /etc/exports
    server: 192.168.17.200
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: openwebui-pvc
  namespace: ollama
spec:
  storageClassName: nfs-slow-0
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
```
Apply it:
kubectl apply -f pv-pvc.yml
2. Deploy Open WebUI
The Open WebUI manifest:
[root@k8s-master01 openui]# cat b.yml
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: webui
  namespace: ollama
spec:
  replicas: 1
  selector:
    matchLabels:
      app: webui
  template:
    metadata:
      labels:
        app: webui
    spec:
      containers:
      - name: webui
        image: registry.cn-hangzhou.aliyuncs.com/hujiaming/open-webui:v1
        env:
        - name: OLLAMA_BASE_URL
          value: http://192.168.17.200:30552   # base URL of the Ollama service (its NodePort)
        - name: HF_ENDPOINT
          value: https://hf-mirror.com         # Hugging Face mirror for networks inside China
        - name: OPENAI_API_KEY
          value: None                          # OpenAI API key, not set here
        - name: OPENAI_API_BASE_URL
          value: None                          # OpenAI API base URL, not set here
        tty: true                              # allocate a terminal
        ports:
        - containerPort: 8080                  # port exposed by the container
        resources:
          requests:
            cpu: "500m"                        # CPU request
            memory: "500Mi"                    # memory request
          limits:
            cpu: "1000m"                       # CPU limit
            memory: "1Gi"                      # memory limit
        volumeMounts:
        - name: webui-volume
          mountPath: /app/backend/data         # Open WebUI's data directory
      volumes:
      - name: webui-volume
        persistentVolumeClaim:
          claimName: openwebui-pvc             # the PVC defined above
---
apiVersion: v1
kind: Service
metadata:
  name: webui
  namespace: ollama
  labels:
    app: webui
spec:
  type: NodePort   # the service listing below shows this exposed as NodePort 31105
  ports:
  - port: 8080
    protocol: TCP
    targetPort: 8080
    nodePort: 31105
  selector:
    app: webui
```
The resulting services:
[root@k8s-master01 openui]# kubectl get svc -A
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 25h
kube-system kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 25h
ollama ollama NodePort 10.99.38.148 <none> 11434:30552/TCP 73m
ollama webui NodePort 10.108.13.158 <none> 8080:31105/TCP 73m
Calling the API directly:
[root@k8s-master01 100]# curl -X POST http://192.168.17.200:30552/api/generate -H "Content-Type: application/json" -d '{"model": "deepseek-r1:1.5b", "prompt": "Hello"}'
{"model":"deepseek-r1:1.5b","created_at":"2025-03-09T11:31:58.412035999Z","response":"\u003cthink\u003e","done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-03-09T11:31:58.47755678Z","response":"\n\n","done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-03-09T11:31:58.524080818Z","response":"\u003c/think\u003e","done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-03-09T11:31:58.575878566Z","response":"\n\n","done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-03-09T11:31:58.638769799Z","response":"Hello","done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-03-09T11:31:58.743448173Z","response":"!","done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-03-09T11:31:58.795222957Z","response":" How","done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-03-09T11:31:58.854765104Z","response":" can","done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-03-09T11:31:58.910070123Z","response":" I","done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-03-09T11:31:58.975574233Z","response":" assist","done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-03-09T11:31:59.021980221Z","response":" you","done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-03-09T11:31:59.087839648Z","response":" today","done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-03-09T11:31:59.141298986Z","response":"?","done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-03-09T11:31:59.266115826Z","response":" 😊","done":false}
{"model":"deepseek-r1:1.5b","created_at":"2025-03-09T11:31:59.342682834Z","response":"","done":true,"done_reason":"stop","context":[151644,9707,151645,151648,271,151649,271,9707,0,2585,646,358,7789,498,3351,30,26525,232],"total_duration":1018481575,"load_duration":25836526,"prompt_eval_count":4,"prompt_eval_duration":58000000,"eval_count":16,"eval_duration":933000000}
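The streaming output above is newline-delimited JSON: each line carries a `response` fragment until a final line with `"done": true`. A minimal Python sketch that reassembles the fragments into the full reply (the host, port, and model name are taken from the curl example above; adjust them for your environment):

```python
import json
import urllib.request


def collect_stream(lines):
    """Concatenate the 'response' fragments from Ollama's NDJSON stream."""
    parts = []
    for raw in lines:
        if not raw.strip():
            continue  # skip blank lines
        chunk = json.loads(raw)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break  # the final chunk carries done=true and timing stats
    return "".join(parts)


def generate(prompt, model="deepseek-r1:1.5b",
             base_url="http://192.168.17.200:30552"):
    """POST to Ollama's /api/generate and return the reassembled reply."""
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps({"model": model, "prompt": prompt}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return collect_stream(line.decode() for line in resp)


# Example (requires the cluster to be reachable):
#   print(generate("Hello"))
```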
That completes the Kubernetes deployment of the DeepSeek model. Because it runs on CPU only, this setup is suitable for testing; in production, use GPUs to accelerate inference.