当前位置：首页 > news >正文

Ollama Python库的使用

news 2025/9/8 11:29:30

Ollama Python库提供了与Ollama集成的最简单方法。源码：https://github.com/ollama/ollama-python，license为MIT，最新发布版本为v0.5.3。

通过Anaconda创建虚拟环境，依次执行如下命令：

conda create --name ollama python=3.10 -y
conda activate ollama
pip install ollama==0.5.3
pip install colorama

在使用ollama python库前，需在本地安装Ollama，windows上安装过程参考： https://blog.csdn.net/fengbingchun/article/details/145964822

1. ollama.list：获取本地已下载了哪些大模型。测试代码如下：

def model_list():response = ollama.list()for model in response.models:print(f"name: {model.model}")print(f"\tsize(MB): {(model.size.real / 1024 / 1024):.2f}")if model.details:print(f"\tformat: {model.details.format}")print(f"\tfamily: {model.details.family}")print(f"\tparameter size: {model.details.parameter_size}")print(f"\tquantization level: {model.details.quantization_level}")

执行结果如下图所示：

2. ollama.generate：用于"一次性生成"，即发送一个prompt，获取模型的回复，然后结束，无上下文记忆。测试代码如下所示：

def generate(model, prompt):try:stream = ollama.generate(model=model, prompt=prompt, stream=True)print("AI: ", end="", flush=True)for chunk in stream:if 'response' in chunk:content = chunk['response']print(content, end="", flush=True)except Exception as e:print(f"Error: {e}")

执行结果如下图所示：

3. ollama.chat：用于"多轮对话"，支持多轮对话上下文管理。测试代码如下所示：

def chat(model, system_prompt):if system_prompt != "":messages = [{'role': 'system', 'content': system_prompt}]else:messages = []while True:user_input = input("\nYou: ").strip()if user_input.lower() in ['quit', 'exit', 'q']:breakif not user_input: # empty inputcontinuemessages.append({'role': 'user', 'content': user_input})try:stream = ollama.chat(model=model, messages=messages, stream=True)print("AI: ", end="", flush=True)assistant_reply = ""for chunk in stream:if 'message' in chunk and 'content' in chunk['message']:content = chunk['message']['content']print(content, end="", flush=True)assistant_reply += contentprint() # line breakmessages.append({'role': 'assistant', 'content': assistant_reply})except Exception as e:print(f"Error: {e}")if messages[-1]['role'] == 'user':messages.pop()

ollama.chat中role赋不同值时的区别：

(1).system：系统指令，定义模型(助手)的整体行为、风格或规则，设定模型身份、语气、回答方式，通常放在对话开始，但不是强制要求。

(2).user：表示用户的输入、问题、请求或指令。

(3).assistant(助手)：表示助手(模型)发送给用户的消息，主要用于维护多轮对话的上下文，保存模型过往的回复。

ollama.chat和ollama.generate中stream参数：默认为False

(1).stream=False：一次性返回完整的模型输出即直到模型生成完整响应后才返回。

(2).stream=True：模型边生成边返回，可立刻看到结果。

执行结果如下图所示：