
[Ollama] A framework for running large language models

news 2025/10/9 13:20:08

Contents

- Installation and running
- Importing an LLM
  - Converting a Hugging Face model to GGUF
  - Running on a specific GPU
  - Setting the model storage path
- Ollama API

Official website
Chinese introduction on GitHub

Installation and running

Installation tutorial
Install:

wget https://ollama.com/download/ollama-linux-amd64.tgz
tar -xzvf ollama-linux-amd64.tgz

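Set an environment variable for the Ollama directory: export OLLAMA_HOME=/data1/ztshao/programs/ollama-linux-amd64
Then add the Ollama bin directory to PATH.
Start the server: ollama serve
Check that it is installed and running: ollama -v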
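
The environment setup above can be kept in ~/.bashrc so it survives new shells. A minimal sketch, assuming the archive was unpacked into the directory used in this post and that the binary sits in its bin subdirectory:

```shell
# Directory where ollama-linux-amd64.tgz was unpacked
# (the path from this post; adjust it to your machine).
export OLLAMA_HOME=/data1/ztshao/programs/ollama-linux-amd64

# Put the ollama binary on PATH (assumes it lives in $OLLAMA_HOME/bin).
export PATH="$OLLAMA_HOME/bin:$PATH"
```

After running source ~/.bashrc, ollama -v should resolve from any directory.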

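Importing an LLM

GGUF is a file format for storing LLMs, and it is the format Ollama uses. A model downloaded from Hugging Face is therefore converted to GGUF before it is imported.

Converting a Hugging Face model to GGUF

First download the GGUF conversion code.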

git clone https://github.com/ggerganov/llama.cpp.git

Run the conversion to produce a .gguf file. The command has the form python convert_hf_to_gguf.py <input_model_path> --outfile <out_gguf_path> --outtype f16. Note that out_gguf_path must end in .gguf.

python convert_hf_to_gguf.py ../Qwen2.5-7B-Instruct --outfile Qwen2.5-7B-Instruct.gguf --outtype f16

Note that the .gguf file is stored inside the model folder.
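
The conversion step can be wrapped so the output name is always derived from the model directory and always ends in .gguf. This is only a sketch: gguf_out_name and convert_to_gguf are hypothetical helper names, and the wrapper assumes it is run from inside the cloned llama.cpp directory with that repository's Python dependencies installed.

```shell
# Hypothetical helper: derive the .gguf output name from a model directory.
gguf_out_name() {
    model_dir="${1%/}"                       # strip a trailing slash, if any
    printf '%s.gguf\n' "$(basename "$model_dir")"
}

# Hypothetical wrapper around the conversion command from this post.
# Assumes the current directory is the llama.cpp checkout.
convert_to_gguf() {
    out_file="$(gguf_out_name "$1")"
    python convert_hf_to_gguf.py "$1" --outfile "$out_file" --outtype f16
}

# Example call (not executed here):
#   convert_to_gguf ../Qwen2.5-7B-Instruct
```

With the model used in this post, the derived output name is Qwen2.5-7B-Instruct.gguf, matching the command above.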

Running the model with Ollama
First create a Modelfile:

FROM ./Qwen2.5-7B-Instruct.gguf

Without quantization: ollama create MyQwen2.5-7B-Instruct -f ./Modelfile
With quantization: ollama create -q Q4_K_M MyQwen2.5-7B-Instruct -f ./Modelfile

List the models Ollama knows about: ollama list
Run the model: ollama run MyQwen2.5-7B-Instruct
Delete the model: ollama rm MyQwen2.5-7B-Instruct
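
The Modelfile used above is a single FROM line, so it can be generated from the shell. A minimal sketch, where write_modelfile is a hypothetical helper name and the second argument (the Modelfile location) is an assumption that defaults to ./Modelfile:

```shell
# Hypothetical helper: write a one-line Modelfile pointing at a .gguf file.
#   $1 = path to the .gguf file
#   $2 = where to write the Modelfile (defaults to ./Modelfile)
write_modelfile() {
    gguf_path="$1"
    modelfile="${2:-./Modelfile}"
    printf 'FROM %s\n' "$gguf_path" > "$modelfile"
}

# Example call (not executed here), followed by the create command from this post:
#   write_modelfile ./Qwen2.5-7B-Instruct.gguf
#   ollama create MyQwen2.5-7B-Instruct -f ./Modelfile
```

A relative path in the FROM line only resolves if the .gguf file sits where the Modelfile expects it, so it is safest to keep both in the same directory.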

Running on a specific GPU

Failed attempt:
Create ./ollama_gpu_selector.sh with the following content:
Reference code

#!/bin/bash

# Validate input: a comma-separated list of GPU indices between 0 and 4
validate_input() {
    if [[ ! $1 =~ ^[0-4](,[0-4])*$ ]]; then
        echo "Error: Invalid input. Please enter numbers between 0 and 4, separated by commas."
        exit 1
    fi
}

# Update the service file with CUDA_VISIBLE_DEVICES values
update_service() {
    # Check if CUDA_VISIBLE_DEVICES environment variable exists in the service file
    if grep -q '^Environment="CUDA_VISIBLE_DEVICES=' /etc/systemd/system/ollama.service; then
        # Update the existing CUDA_VISIBLE_DEVICES values
        sudo sed -i 's/^Environment="CUDA_VISIBLE_DEVICES=.*/Environment="CUDA_VISIBLE_DEVICES='"$1"'"/' /etc/systemd/system/ollama.service
    else
        # Add a new CUDA_VISIBLE_DEVICES environment variable
        sudo sed -i '/\[Service\]/a Environment="CUDA_VISIBLE_DEVICES='"$1"'"' /etc/systemd/system/ollama.service
    fi

    # Reload and restart the systemd service
    sudo systemctl daemon-reload
    sudo systemctl restart ollama.service

    echo "Service updated and restarted with CUDA_VISIBLE_DEVICES=$1"
}

# Check if arguments are passed
if [[ "$#" -eq 0 ]]; then
    # Prompt user for CUDA_VISIBLE_DEVICES values if no arguments are passed
    read -p "Enter CUDA_VISIBLE_DEVICES values (0-4, comma-separated): " cuda_values
    validate_input "$cuda_values"
    update_service "$cuda_values"
else
    # Use arguments as CUDA_VISIBLE_DEVICES values
    cuda_values="$1"
    validate_input "$cuda_values"
    update_service "$cuda_values"
fi

Working version:
I don't have root privileges, so I set the variables directly in ~/.bashrc:

export CUDA_DEVICE_ORDER="PCI_BUS_ID"
export CUDA_VISIBLE_DEVICES=4

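Then reload ~/.bashrc and restart Ollama. ollama serve runs in the foreground, so run ollama run from a second terminal: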

source ~/.bashrc
ollama serve
ollama run MyQwen2.5-7B-Instruct

Check which models Ollama currently has loaded: ollama ps
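
Exporting the variables in ~/.bashrc affects every CUDA program started from that shell. As an alternative sketch (not what this post did), the same two variables can be set for a single command only. run_on_gpu is a hypothetical helper name:

```shell
# Hypothetical helper: run one command with CUDA restricted to the given GPU(s).
#   $1   = GPU index or comma-separated list, e.g. 4 or 0,1
#   rest = the command to run
run_on_gpu() {
    gpu_ids="$1"
    shift
    # Variables placed before a command apply to that command only.
    CUDA_DEVICE_ORDER=PCI_BUS_ID CUDA_VISIBLE_DEVICES="$gpu_ids" "$@"
}

# Example call (not executed here):
#   run_on_gpu 4 ollama serve
```

The variables need to be set on the ollama serve process, since the server is the process that loads the model onto the GPU.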

Setting the model storage path

Reference
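
Ollama reads the model directory from the OLLAMA_MODELS environment variable. A minimal sketch, in which the target directory is a made-up example path rather than one from this post:

```shell
# Hypothetical target directory for model files; pick any writable location.
export OLLAMA_MODELS="$HOME/ollama_models"

# The variable is read by the server, so restart `ollama serve` after changing it.
```

As with the GPU variables above, this line can go into ~/.bashrc when root access is not available.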

Ollama API
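
While ollama serve is running, it exposes an HTTP API on port 11434 by default. The sketch below sends one non-streaming request to the /api/generate endpoint. generate_payload and ollama_generate are hypothetical helper names, and the payload builder assumes the model name and prompt contain no double quotes or backslashes, because it does no JSON escaping.

```shell
# Build the JSON body for /api/generate.
# Assumes $1 (model) and $2 (prompt) need no JSON escaping.
generate_payload() {
    printf '{"model": "%s", "prompt": "%s", "stream": false}' "$1" "$2"
}

# Send the request to a locally running `ollama serve` (default port 11434).
ollama_generate() {
    curl -s http://localhost:11434/api/generate -d "$(generate_payload "$1" "$2")"
}

# Example call (not executed here; requires the server and the model from this post):
#   ollama_generate MyQwen2.5-7B-Instruct "Hello"
```

The response is a JSON object; with "stream": false it arrives as a single object instead of a sequence of chunks.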
