当前位置：首页 > news >正文

10分钟上手OpenAI Agents SDK

news 2025/10/11 11:48:48

OpenAI Agents SDK 能够通过一个轻量级、易于使用的软件包，以极少的抽象来构建智能体化 AI 应用。

https://github.com/openai/openai-agents-python

pip install openai-agents

核心概念

Agents：配置了instructions、tools、guardrails和handoffs的LLMs
Handoffs：Agents SDK中用于在agent间转移控制的专用工具调用
Guardrails：可配置的输入输出安全校验机制
Sessions：跨agent运行的自动对话历史管理
Tracing：内置的agent运行追踪功能，支持工作流的查看、调试与优化

Agents

Agents 是应用中的核心构建模块。一个 agent 就是一个配置了 instructions 和 tools 的大语言模型。

Agent 最常配置的属性包括：

name: 必需的字符串，用于标识您的 agent。
instructions: 也称为开发者消息或系统提示。
model: 指定使用的 LLM，以及可选的 model_settings 用于配置模型调优参数，如 temperature、top_p 等。
tools: Agent 可用于完成任务的工具。

from agents import Agent, ModelSettings, function_tool@function_tool
def get_weather(city: str) -> str:"""returns weather info for the specified city."""return f"The weather in {city} is sunny"agent = Agent(name="Haiku agent",instructions="Always respond in haiku form",model="gpt-5-nano",tools=[get_weather],
)

Agents 在其 context 类型上是泛型的。Context 是一个依赖注入工具：它是一个由您创建并传递给 Runner.run() 的对象，该对象会被传递给每个 agent、tool、handoff 等，它充当 agent 运行的依赖项和状态的集合。您可以提供任何 Python 对象作为 context。

默认情况下，agents 产生纯文本输出。如果您希望 agent 产生特定类型的输出，可以使用 output_type 参数。一个常见的选择是使用 Pydantic 对象，但我们支持任何可以被 Pydantic TypeAdapter 包装的类型——例如 dataclasses、lists、TypedDict 等。

Sessions

Agents SDK 提供了内置的 session memory，可自动维护跨多个 agent 运行的对话历史，无需在对话轮次之间手动处理 .to_input_list()。

Sessions 为特定会话存储对话历史，允许 agents 维持上下文，而无需显式的手动内存管理。这对于构建聊天应用程序或多轮对话特别有用，因为您希望 agent 能记住之前的交互。

from agents import Agent, Runner, SQLiteSession# 创建 agent
agent = Agent(name="Assistant",instructions="Reply very concisely.",
)# 使用 session ID 创建一个 session 实例
session = SQLiteSession("conversation_123")# 第一轮对话
result = await Runner.run(agent,"What city is the Golden Gate Bridge in?",session=session
)
print(result.final_output)  # "San Francisco"# 第二轮对话 - agent 自动记住之前的上下文
result = await Runner.run(agent,"What state is it in?",session=session
)
print(result.final_output)  # "California"# 也适用于同步 runner
result = Runner.run_sync(agent,"What's the population?",session=session
)
print(result.final_output)  # "Approximately 39 million"

当启用 session memory 时：

每次运行前：Runner 自动检索该 session 的对话历史，并将其预置到输入项中。
每次运行后：运行期间生成的所有新项（用户输入、助手响应、工具调用等）都会自动存储在 session 中。
上下文保持：使用相同 session 的后续每次运行都包含完整的对话历史，允许 agent 维持上下文。

智能体调用过程

当调用 Runner.run() 时，系统会运行一个循环，直到获得最终输出。

调用 LLM，使用 agent 上配置的模型、设置及消息历史记录。
LLM 返回响应，该响应可能包含 tool calls。
如果响应包含 final output（详见下文说明），则返回该输出并结束循环。
如果响应包含 handoff，则将 agent 设置为新的 agent 并返回步骤 1。
处理 tool calls（如有），并追加 tool responses 消息。随后返回步骤 1。

调用任何 run 方法都可能导致一个或多个 agents 运行（因此可能有一个或多个 LLM 调用），但它代表了聊天对话中的单个逻辑轮次。例如：

用户轮次：用户输入文本
Runner 运行：第一个 agent 调用 LLM，运行工具，handoff 给第二个 agent，第二个 agent 运行更多工具，然后产生输出。

在 agent 运行结束时，您可以选择向用户显示什么。例如，您可能向用户显示 agents 生成的每个新项，或者只显示最终输出。无论哪种方式，用户都可能会提出后续问题，在这种情况下，您可以再次调用 run 方法。

基础代码

基础使用代码

from agents import Agent, Runner
agent = Agent(name="Assistant", instructions="You are a helpful assistant")result = Runner.run_sync(agent, "Write a haiku about recursion in programming.")
print(result.final_output)

Handoffs智能体转移

from agents import Agent, Runner
import asynciospanish_agent = Agent(name="Spanish agent",instructions="You only speak Spanish.",
)english_agent = Agent(name="English agent",instructions="You only speak English",
)triage_agent = Agent(name="Triage agent",instructions="Handoff to the appropriate agent based on the language of the request.",handoffs=[spanish_agent, english_agent],
)asyncdef main():result = await Runner.run(triage_agent, input="Hola, ¿cómo estás?")print(result.final_output)if __name__ == "__main__":asyncio.run(main())

调用工具

import asyncio
from agents import Agent, Runner, function_tool@function_tool
def get_weather(city: str) -> str:returnf"The weather in {city} is sunny."agent = Agent(name="Hello world",instructions="You are a helpful agent.",tools=[get_weather],
)asyncdef main():result = await Runner.run(agent, input="What's the weather in Tokyo?")print(result.final_output)if __name__ == "__main__":asyncio.run(main())

执行代码

import asynciofrom agents import Agent, CodeInterpreterTool, Runner, traceasync def main():agent = Agent(name="Code interpreter",# Note that using gpt-5 model with streaming for this tool requires org verification# Also, code interpreter tool does not support gpt-5's minimal reasoning effortmodel="gpt-4.1",instructions="You love doing math.",tools=[CodeInterpreterTool(tool_config={"type": "code_interpreter", "container": {"type": "auto"}},)],)with trace("Code interpreter example"):print("Solving math problem...")result = Runner.run_streamed(agent, "What is the square root of273 * 312821 plus 1782?")async for event in result.stream_events():if (event.type == "run_item_stream_event"and event.item.type == "tool_call_item"and event.item.raw_item.type == "code_interpreter_call"):print(f"Code interpreter code:\n```\n{event.item.raw_item.code}\n```\n")elif event.type == "run_item_stream_event":print(f"Other event: {event.item.type}")print(f"Final output: {result.final_output}")if __name__ == "__main__":asyncio.run(main())