当前位置：首页 > news >正文

SGLang部署大模型

news 来源：原创 2025/6/18 7:16:27

SGLang部署大模型

环境信息
基础组件安装
创建python虚拟环境
安装python模块
下载模型
部署模型

显存需求较高，本地4G显存0.5B都无法部署
支持多机多卡部署
支持GPU、CPU混合运行
支持运行格式pt,safetensors,npcache,dummy,gguf,bitsandbytes,layered

环境信息

机器01
操作系统：Debain 12.9/Ubuntu 24.04
CPU：i7-10750H
内存：32G
显卡：GTX 1650（4G）
硬盘：SSD（1T）
IP：192.168.3.17

基础组件安装

创建python虚拟环境

python3 -m venv ~/sglang
source ~/sglang/bin/activate

安装python模块

# 使用清华大学python源，https://pypi.tuna.tsinghua.edu.cn/simple
pip install --upgrade pip -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install sgl-kernel --force-reinstall --no-deps -i https://pypi.tuna.tsinghua.edu.cn/simple
pip install "sglang[all]>=0.4.3.post2" -i https://mirrors.aliyun.com/pypi/simple/
# If you encounter ImportError; cannot import name 'is_valid_list_of_images' from 'transformers.models.llama.image_processing_llama', try to use the specified version of transformers in pyproject.toml. Currently, just running
pip install modelscope unsloth unsloth_zoo bitsandbytes transformers==4.48.3 -i https://mirrors.aliyun.com/pypi/simple/

下载模型

modelscope download --model 'unsloth/DeepSeek-R1-Distill-Qwen-1.5B' --local_dir 'unsloth/DeepSeek-R1-Distill-Qwen-1.5B'

部署模型

python -m sglang.launch_server --model-path ~/ollama/unsloth/DeepSeek-R1-Distill-Qwen-1.5B-GGUF/DeepSeek-R1-Distill-Qwen-1.5B-Q4_K_M.gguf --quantization gguf --cpu-offload-gb 4 --dtype float16 --context-length 16380 --api-key sg-5bgrMOCJ5OSBKQV5XbHz --trust-remote-code --host 0.0.0.0 --port 14144