
[Paper Analysis] [Agent] SEW: Self-Evolving Agentic Workflows for Automated Code Generation

news 2025/10/24 15:00:37

1. Paper information

Title: SEW: Self-Evolving Agentic Workflows for Automated Code Generation

Venue (conference/journal):

Authors:

arXiv: 🔗

GitHub site: 🔗

GitHub code: 🔗

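2. Skim reading

a. Terminology

Direct evolution (DE)

Target: directly modifies the agent's original prompt (Agent's Prompt).

Plain-language view: hand the agent a new instruction directly.

Path: original prompt → mutation prompt → new prompt (first-order DE runs one iteration; second-order DE runs two).

Example:
First-order DE:
Original prompt: "You are a proficient Python programmer..."
Mutation prompt (the rule that tells the LLM how to evolve the prompt): "Modify this instruction in a way that no self-respecting LLM would!"
→ New prompt: "Creative Instruction... Challenge Accepted: Python Code Wizardry!"
Second-order DE builds on this result: a new mutation prompt is applied to the new prompt once more, which makes the instruction more concise (for example, "Reimagined Challenge: As a skilled Python developer...").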
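The first-order and second-order DE path described above can be sketched as a short loop. This is a minimal illustration, not the paper's implementation: `direct_evolve`, `toy_llm`, and the `INSTRUCTION:` request layout are hypothetical names chosen here, the fake LLM only tags its input so the example runs offline, and the wording of the second mutation prompt is an assumption.

```python
from typing import Callable, List

# Hypothetical stand-in for an LLM call: full prompt string in, text out.
LLM = Callable[[str], str]


def direct_evolve(agent_prompt: str, mutation_prompts: List[str], llm: LLM) -> List[str]:
    """Apply one mutation prompt per order and return every intermediate prompt."""
    history = [agent_prompt]
    for mutation_prompt in mutation_prompts:
        # Each order rewrites the most recent prompt, not the original one.
        request = f"{mutation_prompt}\n\nINSTRUCTION: {history[-1]}"
        history.append(llm(request))
    return history


def toy_llm(request: str) -> str:
    """Deterministic fake LLM: it only tags the instruction it was asked to rewrite."""
    instruction = request.split("INSTRUCTION: ", 1)[1]
    return f"[mutated] {instruction}"


history = direct_evolve(
    "You are a proficient Python programmer.",
    [
        # First order: the mutation prompt quoted in the article.
        "Modify this instruction in a way that no self-respecting LLM would!",
        # Second order: assumed wording, the article does not quote this prompt.
        "Rewrite this instruction so that it is more concise.",
    ],
    toy_llm,
)
print(history[1])  # prompt after first-order DE
print(history[2])  # prompt after second-order DE
```

With a real model in place of `toy_llm`, `history[1]` would correspond to the "Challenge Accepted" style prompt and `history[2]` to the "Reimagined Challenge" style prompt.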

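Hyper evolution (HE)

Target: first modifies the mutation prompt itself (through a hyper-mutation prompt), then changes the agent's prompt indirectly.

Plain-language view: first rethink the "rule for changing instructions", then let the agent's instruction evolve.

Path: task description / thinking-style prompt → hyper-mutation prompt → mutation prompt → new prompt.

Example (zero-order HE):
Task description: "LiveCodeBench involves..."
Thinking-style prompt: "How can I simplify the problem so that it is easier to solve?"
→ Hyper-mutation generates a new mutation prompt: "Mutator Prompt... Challenge..."
→ Final new agent prompt: "Sure! Please provide the specific problem description"
Compared with DE, HE adds a meta-iteration layer ("thinking logic → hyper-mutation prompt") that influences the final instruction indirectly.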
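The two-level path of zero-order HE can be sketched in the same style. Again this is only an illustration under assumptions: `hyper_evolve`, `toy_llm`, and the `TASK:` / `INSTRUCTION:` request layout are hypothetical, and the fake LLM returns canned text so the flow can be followed offline.

```python
from typing import Callable, Tuple

LLM = Callable[[str], str]  # hypothetical LLM interface: prompt in, text out


def hyper_evolve(task_description: str, thinking_prompt: str,
                 agent_prompt: str, llm: LLM) -> Tuple[str, str]:
    """Zero-order HE sketch: generate a mutation prompt first, then apply it."""
    # Meta level: task description + thinking-style prompt -> new mutation prompt.
    mutation_prompt = llm(f"{thinking_prompt}\n\nTASK: {task_description}")
    # Object level: the generated mutation prompt rewrites the agent prompt.
    new_agent_prompt = llm(f"{mutation_prompt}\n\nINSTRUCTION: {agent_prompt}")
    return mutation_prompt, new_agent_prompt


def toy_llm(request: str) -> str:
    """Deterministic fake LLM that distinguishes the two levels by their markers."""
    if "TASK: " in request:
        return "Mutator Prompt: restate the instruction in simpler terms."
    return "[rewritten] " + request.split("INSTRUCTION: ", 1)[1]


mutation_prompt, new_prompt = hyper_evolve(
    "LiveCodeBench involves...",
    "How can I simplify the problem so that it is easier to solve?",
    "You are a proficient Python programmer.",
    toy_llm,
)
print(mutation_prompt)
print(new_prompt)
```

The only structural difference from direct evolution is the extra first call: the mutation prompt is itself an LLM output rather than a fixed string.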

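b. Innovations

I. Mechanism innovation: an automated closed loop of prompt self-optimization

1. Evolution logic that removes the dependence on manual work

Traditional approach: prompts are designed and debugged by hand, which is slow and depends on experience.

Innovation: the LLM's own capability is used to iterate on prompts, forming an automated "prompt → LLM → new prompt" loop (in first-order direct evolution, the LLM rewrites the Agent's Prompt according to the Mutation Prompt).

Example (first-order direct evolution):
Original prompt: "You are a proficient Python programmer... return anything except for the program."
Mutation prompt: "Modify this instruction in a way that no self-respecting LLM would!"
New prompt produced by the LLM: "Creative Instruction" "Challenge Accepted: Python Code Wizardry!"\n\nAs a skilled Python programmer...
No human intervention is needed: the LLM carries out the whole "mutate, then optimize" step, replacing the traditional manual prompt-tuning workflow.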

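2. A two-track, layered evolution strategy

Direct evolution (DE): iterates on the agent prompt itself (second-order DE mutates an already-evolved prompt again), making the instruction more precise and better matched to the task (from "Challenge Accepted" to "Reimagined Challenge", progressively simpler and more focused).

Hyper evolution (HE): adds a meta-optimization layer for the mutation prompt (in zero-order HE, the Thinking-style Prompt is optimized first and affects the agent prompt indirectly). By adjusting the "prompt of the prompt", it flexibly changes the agent's "thinking logic".

Value: covers both "change the instruction directly" and "adjust the line of thinking indirectly", so it fits different task needs (code generation needs precise instructions; complex problems need simplified reasoning).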

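II. Mechanism innovation: an automated closed loop of prompt self-optimization

1. Progressive optimization through multi-order iteration

Zero-order → first-order → second-order: moving from a single mutation to repeated iterations, the prompt evolves from a "blunt instruction" ("You are a proficient Python programmer...") → "playful motivation" ("Challenge Accepted: Python Code Wizardry!") → "precise focus" ("Reimagined Challenge: As a skilled Python developer...").

Innovation: several rounds of LLM processing replace manual debugging of multiple prompt versions, so the prompt comes to fit the task as it evolves (for code generation, the final prompt puts more weight on the "specific problem description", which makes it more precise).

2. Reverse creativity of the mutation prompt

"Reverse design" of the mutation prompt:
Instruction: "Modify this instruction in a way that no self-respecting LLM would!"
Meaning: the instruction is deliberately phrased to push the LLM away from conventional rewrites, so that it produces more creative, differentiated prompts (for example, adding playful wording such as "Creative Instruction").

Value: breaks the habit of "forward-only optimization". The reverse instruction stimulates the LLM's creativity, moves prompts away from templated wording, and suits tasks that need novel responses (such as exploring alternative approaches in code generation).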

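III. Scenario innovation: dynamic adjustment for complex tasks

1. Task-driven prompt adaptation

For the code-generation scenario:
The initial prompt focuses on the "proficient Python programmer" role and strict output constraints. After evolution, it emphasizes the "specific problem description" and "simplify the problem", which matches tasks such as LiveCodeBench that require passing test cases and reward simplified logic.

Innovation: prompt evolution is not blind iteration; it is adjusted dynamically around the task's requirements (code generation needs prompts that are precise, concise, and lead to executable output, and the evolved prompts are closer to these properties).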
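One way to make "task-driven" concrete is to score each candidate prompt by how many test cases the code it produces passes, and keep the best one. The sketch below is an assumption about how such a selection step could look, not the procedure used in the paper: `pass_rate`, `pick_best_prompt`, the `PROGRAMS` lookup that stands in for an LLM, and the toy `add` task are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

TestCase = Tuple[tuple, object]  # (positional arguments, expected return value)


def pass_rate(program: str, entry_point: str, tests: List[TestCase]) -> float:
    """Run a generated program and return the fraction of test cases it passes."""
    namespace: Dict[str, object] = {}
    try:
        exec(program, namespace)  # acceptable only for trusted toy code
        func = namespace[entry_point]
    except Exception:
        return 0.0  # code that does not load passes nothing
    passed = 0
    for args, expected in tests:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crashing test case counts as a failure
    return passed / len(tests) if tests else 0.0


def pick_best_prompt(prompts: List[str], generate: Callable[[str], str],
                     entry_point: str, tests: List[TestCase]) -> str:
    """Keep the prompt whose generated program passes the most tests."""
    return max(prompts, key=lambda p: pass_rate(generate(p), entry_point, tests))


# Toy stand-in for "LLM generates code from a prompt": a fixed lookup table.
PROGRAMS = {
    "terse": "def add(a, b):\n    return a - b\n",    # buggy output
    "evolved": "def add(a, b):\n    return a + b\n",  # correct output
}
TESTS: List[TestCase] = [((1, 2), 3), ((0, 0), 0)]

best = pick_best_prompt(list(PROGRAMS), PROGRAMS.__getitem__, "add", TESTS)
print(best)
```

In a real setting `generate` would call the agent with the candidate prompt, and untrusted generated code would need to run in a sandbox rather than through `exec`.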

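2. Implicit enhancement of agent capability

Adjusting the "thinking logic" through hyper evolution:
In zero-order hyper evolution, optimizing the Thinking-style Prompt ("How can I simplify the problem...") shifts the agent from "execute the task directly" to "first think about how to simplify it", which indirectly improves problem-solving efficiency (in complex code generation, simplifying the problem can lower the error rate).

Value: beyond polishing the wording of instructions, prompt adjustment reshapes the agent's "internal logic", giving it implicit abilities such as "proactively simplifying the problem" and "understanding requirements precisely", which suits more complex collaborative scenarios.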

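3. Close reading

a. docs/modules

The three core components of an Agent:

LLM (large language model): the Agent's "brain", responsible for understanding context, generating replies, and making decisions.

Actions: the Agent's "hands and feet". Each Action represents one concrete task (question answering, summarization, API calls, and so on); the reasoning itself is carried out by the LLM.

Memory: the Agent's "memory", split into short-term (conversation context) and long-term (cross-session knowledge, preferences, and so on).

Defining and implementing an Action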

from evoagentx.actions import Action, ActionInput, ActionOutput

# Input schema: the question to ask
class AnswerQuestionInput(ActionInput):
    question: str

# Output schema: the answer returned by the LLM
class AnswerQuestionOutput(ActionOutput):
    answer: str

class AnswerQuestionAction(Action):

    def __init__(
        self,
        name = "answer_question",
        description = "Answers a factual question using the LLM",
        prompt = "Answer the following question as accurately as possible:\n\n{question}",
        inputs_format = AnswerQuestionInput,
        outputs_format = AnswerQuestionOutput,
        **kwargs
    ):
        super().__init__(
            name=name,
            description=description,
            prompt=prompt,
            inputs_format=inputs_format,
            outputs_format=outputs_format,
            **kwargs
        )

    def execute(self, llm, inputs, sys_msg = None, return_prompt = False, **kwargs) -> AnswerQuestionOutput:
        question = inputs.get("question")
        prompt = self.prompt.format(question=question)
        response = llm.generate(
            prompt=prompt,
            system_message=sys_msg,
            parser=self.outputs_format,
            parse_mode="str"
        )
        if return_prompt:
            return response, prompt
        return response

    # Asynchronous variant of execute
    async def async_execute(self, llm, inputs, sys_msg = None, return_prompt = False, **kwargs) -> AnswerQuestionOutput:
        question = inputs.get("question")
        prompt = self.prompt.format(question=question)
        response = await llm.async_generate(
            prompt=prompt,
            system_message=sys_msg,
            parser=self.outputs_format,
            parse_mode="str"
        )
        if return_prompt:
            return response, prompt
        return response

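Input and output data structures

class AnswerQuestionInput(ActionInput):
    question: str

class AnswerQuestionOutput(ActionOutput):
    answer: str

These two classes define the action's input and output formats; they inherit from ActionInput and ActionOutput respectively.

AnswerQuestionInput has a single field, question, which holds the question the user wants to ask.

AnswerQuestionOutput has a single field, answer, which holds the answer returned by the LLM.

Structured definitions like these help with type checking, automatic documentation generation, and downstream parsing.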
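To illustrate why a declared schema helps with type checking, here is a small standalone sketch. It does not use EvoAgentX: standard-library dataclasses stand in for ActionInput and ActionOutput, and the `build` helper is a hypothetical name introduced only for this example.

```python
from dataclasses import dataclass
from typing import get_type_hints


@dataclass
class AnswerQuestionInput:  # stand-in for the ActionInput subclass
    question: str


@dataclass
class AnswerQuestionOutput:  # stand-in for the ActionOutput subclass
    answer: str


def build(model, data: dict):
    """Construct `model` from a dict, rejecting missing, extra, or mistyped fields."""
    expected = get_type_hints(model)  # e.g. {"question": str}
    if set(data) != set(expected):
        raise ValueError(f"expected fields {sorted(expected)}, got {sorted(data)}")
    for name, typ in expected.items():
        if not isinstance(data[name], typ):
            raise TypeError(f"field {name!r} must be {typ.__name__}")
    return model(**data)


ok = build(AnswerQuestionInput, {"question": "What is a workflow?"})
print(ok.question)

try:
    build(AnswerQuestionInput, {"question": 42})  # wrong type is caught early
except TypeError as err:
    print("rejected:", err)
```

Because the fields are declared once on the class, the same declaration can drive validation, documentation, and parsing of the LLM's reply.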

Action class definition

class AnswerQuestionAction(Action):

    def __init__(
        self,
        name = "answer_question",
        description = "Answers a factual question using the LLM",
        prompt = "Answer the following question as accurately as possible:\n\n{question}",
        inputs_format = AnswerQuestionInput,
        outputs_format = AnswerQuestionOutput,
        **kwargs
    ):
        super().__init__(
            name=name,
            description=description,
            prompt=prompt,
            inputs_format=inputs_format,
            outputs_format=outputs_format,
            **kwargs
        )

    def execute(self, llm, inputs, sys_msg = None, return_prompt = False, **kwargs) -> AnswerQuestionOutput:
        question = inputs.get("question")
        prompt = self.prompt.format(question=question)
        response = llm.generate(
            prompt=prompt,
            system_message=sys_msg,
            parser=self.outputs_format,
            parse_mode="str"
        )
        if return_prompt:
            return response, prompt
        return response

    async def async_execute(self, llm, inputs, sys_msg = None, return_prompt = False, **kwargs) -> AnswerQuestionOutput:
        question = inputs.get("question")
        prompt = self.prompt.format(question=question)
        response = await llm.async_generate(
            prompt=prompt,
            system_message=sys_msg,
            parser=self.outputs_format,
            parse_mode="str"
        )
        if return_prompt:
            return response, prompt
        return response

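name: the action's name, used by the agent to call and manage it.

description: a description of the action, used in documentation and in the agent profile.

prompt: the LLM prompt template; {question} is replaced with the actual question.

inputs_format / outputs_format: the data structures for input and output.

**kwargs: allows extra parameters to be passed through.

super().__init__: calls the constructor of the parent class Action to complete registration and initialization.

Parameters of execute:

llm: the large language model instance (for example OpenAI GPT).

inputs: the input data (a dict), which should contain a question field.

sys_msg: the system prompt (optional).

return_prompt: whether to also return the prompt (for debugging).

Execution flow:

Read question from inputs.

Format the final prompt from the prompt template.

Call llm.generate to produce the answer, passing the prompt, the system prompt, the output parser (outputs_format), and the parse mode.

If return_prompt is True, return (response, prompt); otherwise return only response.

The return type is AnswerQuestionOutput (a structured answer).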
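The execution flow above can be traced without an API key by swapping the real model for a fake one. The sketch below is a standalone stand-in, not EvoAgentX code: `FakeLLM` is hypothetical and only accepts the keyword arguments used in the article, and a dataclass replaces the real output class.

```python
from dataclasses import dataclass


@dataclass
class AnswerQuestionOutput:  # stand-in for the structured output class
    answer: str


class FakeLLM:
    """Offline stand-in for an LLM client; echoes the last line of the prompt."""

    def generate(self, prompt, system_message=None, parser=None, parse_mode="str"):
        text = f"echo: {prompt.splitlines()[-1]}"
        # Wrap the raw text in the structured output type when a parser is given.
        return parser(answer=text) if parser else text


PROMPT = "Answer the following question as accurately as possible:\n\n{question}"


def execute(llm, inputs, sys_msg=None, return_prompt=False):
    question = inputs.get("question")          # step 1: read the question
    prompt = PROMPT.format(question=question)  # step 2: fill the template
    response = llm.generate(                   # step 3: call the model
        prompt=prompt,
        system_message=sys_msg,
        parser=AnswerQuestionOutput,
        parse_mode="str",
    )
    if return_prompt:                          # step 4: optionally expose the prompt
        return response, prompt
    return response


response, prompt = execute(FakeLLM(), {"question": "What is 2 + 2?"}, return_prompt=True)
print(prompt)
print(response.answer)
```

Returning the prompt alongside the response is convenient when debugging, because it shows exactly what text the model received after template substitution.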

Creating and calling an Agent

from evoagentx.agents import Agent
from evoagentx.models import OpenAILLMConfig

llm_config = OpenAILLMConfig(model="gpt-4o-mini", openai_key="your-api-key")

agent = Agent(
    name="AssistantAgent",
    description="Answers a factual question using the LLM",
    llm_config=llm_config,
    system_prompt="You are a helpful assistant.",
    actions=[AnswerQuestionAction()]
)

The Agent is bound to a large model through llm_config (here OpenAI's gpt-4o-mini).

The actions parameter takes all available Action instances.

system_prompt sets the global system prompt.

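Memory management

messages = agent.short_term_memory.get(n=5)  # get the 5 most recent messages
agent.clear_short_term_memory()              # clear short-term memory

The Agent has built-in short-term memory (such as the conversation history) that can be read or cleared at any time.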
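The "get the last n messages" and "clear" behaviour can be pictured with a bounded queue. This is a conceptual sketch only and says nothing about how EvoAgentX stores messages internally: the `ShortTermMemory` class here, its `add` method, and the `max_size` parameter are hypothetical.

```python
from collections import deque


class ShortTermMemory:
    """Toy short-term memory: keeps the most recent messages in order."""

    def __init__(self, max_size=100):
        # Oldest messages are dropped automatically once max_size is reached.
        self._messages = deque(maxlen=max_size)

    def add(self, message):
        self._messages.append(message)

    def get(self, n=None):
        """Return the last n messages, or all of them when n is None."""
        messages = list(self._messages)
        if n is None:
            return messages
        return messages[-n:] if n > 0 else []

    def clear(self):
        self._messages.clear()


memory = ShortTermMemory()
for i in range(7):
    memory.add(f"message {i}")

recent = memory.get(n=5)
print(recent)  # the five most recent messages, oldest first

memory.clear()
print(memory.get())
```

Bounding the queue keeps the conversation context from growing without limit, which matters because that context is usually sent back to the model on each turn.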

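Agent Profile (capability description)

profile = agent.get_agent_profile()
print(profile)

This prints the capability description of the agent and all of its actions, which helps with understanding and debugging.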
