Implementing Cline's partial file editing tool in Python
Contents
- Implementing Cline's partial file editing tool in Python
- Tool prompt
- LLM response parser
- edit_file_by_diff tool code
- Demo
- Project structure
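I recently used Cline to write code and found it very effective. Its ability to modify only part of a file is especially impressive, so I ported its prompt and tooling to Python for use in building agents.
Code: https://github.com/AriesYB/cline-tools.git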
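Tool prompt
This is the core tool-use prompt extracted from Cline, keeping only the description of the partial file editing tool edit_file_by_diff. The OBJECTIVE section states the goal and has the AI break the task into steps and work through them in order. The last section is the user instruction template: fill the instructions for your own scenario into {prompt}.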
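The {prompt} placeholder described above has to be filled in before the prompt is sent to the model. A minimal sketch, assuming a shortened stand-in template (TEMPLATE) and a hypothetical helper build_system_prompt, neither of which is part of the repository. It uses str.replace rather than str.format so that any literal braces elsewhere in the prompt are left untouched.

```python
# Shortened stand-in for tool_prompt; the real template is the full prompt string.
TEMPLATE = """====
USER'S CUSTOM INSTRUCTIONS
{prompt}
"""


def build_system_prompt(template: str, user_instructions: str) -> str:
    """Fill the {prompt} placeholder (hypothetical helper, not from the repo)."""
    if "{prompt}" not in template:
        raise ValueError("template has no {prompt} placeholder")
    # str.replace only touches the placeholder, unlike str.format,
    # which would fail on any other literal braces in the prompt.
    return template.replace("{prompt}", user_instructions.strip())


system_prompt = build_system_prompt(
    TEMPLATE, "Change the page title in old.html to 'Hello'."
)
```

The resulting string would typically be passed as the system message, with the file content supplied in the user message.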
# Prompt for tool use
tool_prompt = """
====
TOOL USE
# Tools
## edit_file_by_diff
Description: Request to replace sections of content in an existing file using SEARCH/REPLACE blocks that define exact changes to specific parts of the file. This tool should be used when you need to make targeted changes to specific parts of a file.
Parameters:
- diff: (required) One or more SEARCH/REPLACE blocks following this exact format:
\```
------- SEARCH
[exact content to find]
=======
[new content to replace with]
+++++++ REPLACE
\```
Return: the modified file
Critical rules:
1. SEARCH content must match the associated file section to find EXACTLY:
* Match character-for-character including whitespace, indentation, line endings
* Include all comments, docstrings, etc.
2. SEARCH/REPLACE blocks will ONLY replace the first match occurrence.
* Including multiple unique SEARCH/REPLACE blocks if you need to make multiple changes.
* Include *just* enough lines in each SEARCH section to uniquely match each set of lines that need to change.
* When using multiple SEARCH/REPLACE blocks, list them in the order they appear in the file.
3. Keep SEARCH/REPLACE blocks concise:
* Break large SEARCH/REPLACE blocks into a series of smaller blocks that each change a small portion of the file.
* Include just the changing lines, and a few surrounding lines if needed for uniqueness.
* Do not include long runs of unchanging lines in SEARCH/REPLACE blocks.
* Each line must be complete. Never truncate lines mid-way through as this can cause matching failures.
4. Special operations:
* To move code: Use two SEARCH/REPLACE blocks (one to delete from original + one to insert at new location)
* To delete code: Use empty REPLACE section
Usage:
<edit_file_by_diff>
<diff>
Search and replace blocks here
</diff>
</edit_file_by_diff>
====
OBJECTIVE
You accomplish a given task iteratively, breaking it down into clear steps and working through them methodically.
1. Analyze the user's task and set clear, achievable goals to accomplish it. Prioritize these goals in a logical order.
2. Work through these goals sequentially, utilizing available tools one at a time as necessary. Each goal should correspond to a distinct step in your problem-solving process. You will be informed on the work completed and what's remaining as you go.
3. Remember, you have extensive capabilities with access to a wide range of tools that can be used in powerful and clever ways as necessary to accomplish each goal. Before calling a tool, do some analysis within <thinking></thinking> tags. First, analyze the file structure provided in environment_details to gain context and insights for proceeding effectively. Then, think about which of the provided tools is the most relevant tool to accomplish the user's task. Next, go through each of the required parameters of the relevant tool and determine if the user has directly provided or given enough information to infer a value. When deciding if the parameter can be inferred, carefully consider all the context to see if it supports a specific value. If all of the required parameters are present or can be reasonably inferred, close the thinking tag and proceed with the tool use. BUT, if one of the values for a required parameter is missing, DO NOT invoke the tool (not even with fillers for the missing params) and instead, ask the user to provide the missing parameters using the ask_followup_question tool. DO NOT ask for more information on optional parameters if it is not provided.
4. Once you've completed the user's task, you must use the task_complete tool to present the result of the task to the user.
5. The user may provide feedback, which you can use to make improvements and try again. But DO NOT continue in pointless back and forth conversations, i.e. don't end your responses with questions or offers for further assistance.
====
USER'S CUSTOM INSTRUCTIONS
The following additional instructions are provided by the user, and should be followed to the best of your ability without interfering with the TOOL USE guidelines.
{prompt}
"""
LLM response parser
The prompt above requires the AI to return file modifications in the following format:
<edit_file_by_diff>
<diff>
------- SEARCH
[exact content to find in the original file]
=======
[replacement content]
+++++++ REPLACE
</diff>
</edit_file_by_diff>
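To exercise a parser without calling a model, a response in this format can be built by hand. A minimal sketch, assuming a hypothetical helper make_tool_call that is not part of the repository; it wraps a single SEARCH/REPLACE block and puts every marker on its own line, which the line-based diff processing relies on.

```python
def make_tool_call(search: str, replace: str) -> str:
    """Wrap one SEARCH/REPLACE block in edit_file_by_diff tags (hypothetical helper)."""
    # Each marker must sit on its own line.
    block = "\n".join(
        ["------- SEARCH", search, "=======", replace, "+++++++ REPLACE"]
    )
    return f"<edit_file_by_diff>\n<diff>\n{block}\n</diff>\n</edit_file_by_diff>"


message = make_tool_call("<title>Old</title>", "<title>New</title>")
```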
We therefore need to parse this XML-like markup, which is the job of the LLM response parser. To extract the tool name and its parameters correctly, the tool names and parameter names must also be declared in lists.
"""LLM message parser. Tool names and tool parameter names must be declared here so the XML tags can be parsed."""

# Tool names and parameter names (extend as needed)
tool_use_names = ["edit_file_by_diff"]
tool_param_names = [
    "path", "recursive", "regex", "file_pattern", "command", "requires_approval",
    "content", "question", "options", "server_name", "tool_name", "arguments",
    "uri", "result", "action", "url", "coordinate", "text", "diff", "response",
    "context", "input", "component_name",
]


def parse_assistant_message_v2(assistant_message: str) -> list:
    """
    **Version 2**

    Parse an assistant message string (which may mix plain text with tool-call
    blocks marked by XML tags) into a list of structured content objects.

    This version avoids a character-by-character accumulator: it walks the
    string by index, uses precomputed tag maps for fast lookup, and extracts
    content blocks by slicing.

    Args:
        assistant_message: raw string output by the assistant

    Returns:
        A list of dicts, each a text block or a tool call; unclosed blocks
        are marked partial=True
    """
    content_blocks = []  # parsed content blocks
    current_text_content_start = 0  # start index of the current text block
    current_text_content = None  # current text block object
    current_tool_use_start = 0  # start index of the current tool block (after the tag)
    current_tool_use = None  # current tool block object
    current_param_value_start = 0  # start index of the current parameter value (after the tag)
    current_param_name = None  # current parameter name

    # Precompute the opening-tag maps for tools and parameters
    tool_use_open_tags = {f"<{name}>": name for name in tool_use_names}
    tool_param_open_tags = {f"<{name}>": name for name in tool_param_names}

    n = len(assistant_message)
    for i in range(n):
        # --- State: parsing a tool parameter ---
        if current_tool_use and current_param_name:
            close_tag = f"</{current_param_name}>"
            close_tag_len = len(close_tag)
            # Does the parameter's closing tag end at position i?
            if i >= close_tag_len - 1 and assistant_message.startswith(close_tag, i - close_tag_len + 1):
                # Extract the parameter value and store it on the tool
                value = assistant_message[current_param_value_start: i - close_tag_len + 1].strip()
                current_tool_use["params"][current_param_name] = value
                current_param_name = None  # leave the parameter state
            else:
                continue  # still inside the parameter value

        # --- State: inside a tool block (no active parameter) ---
        if current_tool_use and not current_param_name:
            started_new_param = False
            # Check whether a new parameter starts here
            for tag, param_name in tool_param_open_tags.items():
                tag_len = len(tag)
                if i >= tag_len - 1 and assistant_message.startswith(tag, i - tag_len + 1):
                    current_param_name = param_name
                    current_param_value_start = i + 1  # value starts after the tag
                    started_new_param = True
                    break
            if started_new_param:
                continue  # parameter start handled, move to the next character

            # Check for the tool's closing tag
            tool_close_tag = f"</{current_tool_use['name']}>"
            tool_close_len = len(tool_close_tag)
            if (i >= tool_close_len - 1
                    and assistant_message.startswith(tool_close_tag, i - tool_close_len + 1)):
                # Special-case parameters (e.g. the content of write_to_file)
                tool_content_slice = assistant_message[current_tool_use_start: i - tool_close_len + 1]
                if current_tool_use["name"] == "write_to_file" and "<content>" in tool_content_slice:
                    content_start = tool_content_slice.find("<content>")
                    content_end = tool_content_slice.rfind("</content>")
                    if content_start != -1 and content_end > content_start:
                        content_value = tool_content_slice[content_start + 9: content_end].strip()
                        current_tool_use["params"]["content"] = content_value
                # Mark the tool block as complete and add it to the result
                current_tool_use["partial"] = False
                content_blocks.append(current_tool_use)
                current_tool_use = None
                current_text_content_start = i + 1  # following text starts here
                continue
            else:
                continue  # still inside the tool block

        # --- State: parsing text / looking for a tool start ---
        if not current_tool_use:
            started_new_tool = False
            # Check whether a new tool starts here
            for tag, tool_name in tool_use_open_tags.items():
                tag_len = len(tag)
                if i >= tag_len - 1 and assistant_message.startswith(tag, i - tag_len + 1):
                    # Close the current text block
                    if current_text_content:
                        text_content = assistant_message[current_text_content_start: i - tag_len + 1].strip()
                        if text_content:
                            current_text_content["content"] = text_content
                            current_text_content["partial"] = False
                            content_blocks.append(current_text_content)
                        current_text_content = None
                    else:
                        # Check for text in front of the tag
                        potential_text = assistant_message[current_text_content_start: i - tag_len + 1].strip()
                        if potential_text:
                            content_blocks.append({
                                "type": "text",
                                "content": potential_text,
                                "partial": False,
                            })
                    # Start a new tool block
                    current_tool_use = {
                        "type": "tool_use",
                        "name": tool_name,
                        "params": {},
                        "partial": True,
                    }
                    current_tool_use_start = i + 1
                    started_new_tool = True
                    break
            if started_new_tool:
                continue

            # Start a new text block if there is none
            if not current_text_content:
                current_text_content_start = i
                current_text_content = {
                    "type": "text",
                    "content": "",
                    "partial": True,
                }

    # --- After the loop: handle unclosed blocks ---
    if current_tool_use and current_param_name:
        # Unclosed parameter
        current_tool_use["params"][current_param_name] = assistant_message[current_param_value_start:].strip()
    if current_tool_use:
        content_blocks.append(current_tool_use)  # unclosed tool block
    elif current_text_content:
        # Unclosed text block
        current_text_content["content"] = assistant_message[current_text_content_start:].strip()
        if current_text_content["content"]:
            content_blocks.append(current_text_content)
    return content_blocks


def parse_assistant_message_v3(assistant_message: str) -> list:
    """
    **Version 3**

    Extends V2 with support for the <function_calls> format: it handles the
    structured tool-call tags and maps them to the corresponding tool names.

    Args:
        assistant_message: raw string output by the assistant

    Returns:
        A list of content blocks, each a text block or a tool call
    """
    content_blocks = []
    current_text_content_start = 0
    current_text_content = None
    current_tool_use_start = 0
    current_tool_use = None
    current_param_value_start = 0
    current_param_name = None

    # Precompute the tag maps
    tool_use_open_tags = {f"<{name}>": name for name in tool_use_names}
    tool_param_open_tags = {f"<{name}>": name for name in tool_param_names}

    # Constants for the function_calls format
    FUNC_CALLS_OPEN = "<function_calls>"
    FUNC_CALLS_CLOSE = "</function_calls>"
    INVOKE_START = '<invoke name="'
    INVOKE_END = '">'
    INVOKE_CLOSE = "</invoke>"
    PARAM_START = '<parameter name="'
    PARAM_END = '">'
    PARAM_CLOSE = "</parameter>"

    # function_calls parsing state
    in_function_calls = False
    current_invoke_name = ""
    current_parameter_name = ""

    n = len(assistant_message)
    i = 0
    while i < n:
        # --- Parse function_calls blocks ---
        # A tag is detected when i points at its last character, so after a
        # match we advance by one character, not by the tag length.
        # Check for the function_calls opening tag
        if not in_function_calls and i >= len(FUNC_CALLS_OPEN) - 1:
            if assistant_message.startswith(FUNC_CALLS_OPEN, i - len(FUNC_CALLS_OPEN) + 1):
                # Close the current text block
                if current_text_content:
                    text_content = assistant_message[current_text_content_start: i - len(FUNC_CALLS_OPEN) + 1].strip()
                    if text_content:
                        current_text_content["content"] = text_content
                        current_text_content["partial"] = False
                        content_blocks.append(current_text_content)
                    current_text_content = None
                in_function_calls = True
                i += 1  # move past the tag
                continue

        # Inside function_calls: parse the start of an invoke
        if in_function_calls and not current_invoke_name:
            if i >= len(INVOKE_START) - 1 and assistant_message.startswith(INVOKE_START, i - len(INVOKE_START) + 1):
                # Extract the invoke name, which starts right after the tag prefix
                name_end_pos = assistant_message.find(INVOKE_END, i + 1)
                if name_end_pos != -1:
                    current_invoke_name = assistant_message[i + 1: name_end_pos]
                    # Map the invoke name to a tool
                    tool_mapping = {
                        "LS": "list_files",
                        "Grep": "search_files",
                        "Bash": "execute_command",
                        "Read": "read_file",
                        "Write": "write_to_file",
                        "WebFetch": "web_fetch",
                        "AskQuestion": "ask_followup_question",
                        "UseMCPTool": "use_mcp_tool",
                        "AccessMCPResource": "access_mcp_resource",
                        "ListCodeDefinitionNames": "list_code_definition_names",
                        "PlanModeRespond": "plan_mode_respond",
                        "LoadMcpDocumentation": "load_mcp_documentation",
                        "AttemptCompletion": "attempt_completion",
                        "BrowserAction": "browser_action",
                        "NewTask": "new_task",
                        "MultiEdit": "replace_in_file",
                    }
                    tool_name = tool_mapping.get(current_invoke_name)
                    if tool_name:
                        current_tool_use = {
                            "type": "tool_use",
                            "name": tool_name,
                            "params": {},
                            "partial": True,
                        }
                    i = name_end_pos + len(INVOKE_END)  # skip the rest of the invoke opening
                    continue

        # Parse the start of a parameter
        if in_function_calls and current_invoke_name and not current_parameter_name:
            if i >= len(PARAM_START) - 1 and assistant_message.startswith(PARAM_START, i - len(PARAM_START) + 1):
                name_end_pos = assistant_message.find(PARAM_END, i + 1)
                if name_end_pos != -1:
                    current_parameter_name = assistant_message[i + 1: name_end_pos]
                    current_param_value_start = name_end_pos + len(PARAM_END)
                    i = name_end_pos + len(PARAM_END)  # skip the rest of the parameter opening
                    continue

        # Parse the end of a parameter
        if in_function_calls and current_parameter_name:
            if i >= len(PARAM_CLOSE) - 1 and assistant_message.startswith(PARAM_CLOSE, i - len(PARAM_CLOSE) + 1):
                # Extract the value and map it onto the tool's parameters
                value = assistant_message[current_param_value_start: i - len(PARAM_CLOSE) + 1].strip()
                if current_tool_use:
                    # Map parameters according to the tool type
                    if current_invoke_name == "LS" and current_parameter_name == "path":
                        current_tool_use["params"]["path"] = value
                        current_tool_use["params"]["recursive"] = "false"
                    elif current_invoke_name == "Read" and current_parameter_name == "file_path":
                        current_tool_use["params"]["path"] = value
                    # Other parameter mappings omitted (see the logic in the TS code)
                current_parameter_name = ""
                i += 1  # move past the parameter closing tag
                continue

        # Parse the end of an invoke
        if in_function_calls and current_invoke_name:
            if i >= len(INVOKE_CLOSE) - 1 and assistant_message.startswith(INVOKE_CLOSE, i - len(INVOKE_CLOSE) + 1):
                if current_tool_use:
                    current_tool_use["partial"] = False
                    content_blocks.append(current_tool_use)
                    current_tool_use = None
                current_invoke_name = ""
                i += 1  # move past the closing tag
                continue

        # Parse the end of function_calls
        if in_function_calls and i >= len(FUNC_CALLS_CLOSE) - 1:
            if assistant_message.startswith(FUNC_CALLS_CLOSE, i - len(FUNC_CALLS_CLOSE) + 1):
                in_function_calls = False
                current_text_content_start = i + 1
                current_text_content = {
                    "type": "text",
                    "content": "",
                    "partial": True,
                }
                i += 1  # move past the closing tag
                continue

        # Skip the regular parsing while inside a function_calls block
        if in_function_calls:
            i += 1
            continue

        # --- From here on the logic is the same as V2 ---
        # (Omitted for brevity; a real implementation must copy the V2 parsing code here)
        i += 1

    # Finalize unclosed blocks (same as V2)
    if current_tool_use and current_param_name:
        current_tool_use["params"][current_param_name] = assistant_message[current_param_value_start:].strip()
    if current_tool_use:
        content_blocks.append(current_tool_use)
    elif current_text_content:
        current_text_content["content"] = assistant_message[current_text_content_start:].strip()
        if current_text_content["content"]:
            content_blocks.append(current_text_content)
    return content_blocks
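When only one known tool is involved and the response is already complete, the extraction step can also be sketched with a regular expression. This is an illustrative alternative under those assumptions, not the parser used in the repository: TOOL_CALL_RE and extract_diff are hypothetical names, and unlike the index scanner this approach cannot report partial (still streaming) blocks.

```python
import re
from typing import Optional

# Hypothetical single-tool extractor; the names are illustrative only.
TOOL_CALL_RE = re.compile(
    r"<edit_file_by_diff>\s*<diff>(?P<diff>.*?)</diff>\s*</edit_file_by_diff>",
    re.DOTALL,  # let "." span the newlines inside the diff
)


def extract_diff(assistant_message: str) -> Optional[str]:
    """Return the stripped <diff> payload of the first tool call, or None."""
    match = TOOL_CALL_RE.search(assistant_message)
    return match.group("diff").strip() if match else None


sample = (
    "I will update the title.\n"
    "<edit_file_by_diff>\n<diff>\n"
    "------- SEARCH\n<title>Old</title>\n=======\n<title>New</title>\n+++++++ REPLACE\n"
    "</diff>\n</edit_file_by_diff>"
)
diff = extract_diff(sample)
```

The non-greedy `.*?` stops at the first `</diff>`, so a diff whose content itself contains that closing tag would be cut short; the scanner has the same limitation for parameter values.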
edit_file_by_diff tool code
The implementation that modifies a file by applying a diff.
import re

SEARCH_BLOCK_START = "------- SEARCH"
SEARCH_BLOCK_END = "======="
REPLACE_BLOCK_END = "+++++++ REPLACE"

SEARCH_BLOCK_CHAR = "-"
REPLACE_BLOCK_CHAR = "+"
LEGACY_SEARCH_BLOCK_CHAR = "<"
LEGACY_REPLACE_BLOCK_CHAR = ">"

# Flexible regex patterns used in place of the exact string constants
SEARCH_BLOCK_START_REGEX = re.compile(r"^[-]{3,} SEARCH>?$")
LEGACY_SEARCH_BLOCK_START_REGEX = re.compile(r"^[<]{3,} SEARCH>?$")

SEARCH_BLOCK_END_REGEX = re.compile(r"^[=]{3,}$")

REPLACE_BLOCK_END_REGEX = re.compile(r"^[+]{3,} REPLACE>?$")
LEGACY_REPLACE_BLOCK_END_REGEX = re.compile(r"^[>]{3,} REPLACE>?$")

# Helper functions that check whether a line matches the flexible patterns
def is_search_block_start(line: str) -> bool:
    """Check whether the line starts a SEARCH block."""
    return bool(SEARCH_BLOCK_START_REGEX.match(line)) or bool(LEGACY_SEARCH_BLOCK_START_REGEX.match(line))


def is_search_block_end(line: str) -> bool:
    """Check whether the line ends a SEARCH block."""
    return bool(SEARCH_BLOCK_END_REGEX.match(line))


def is_replace_block_end(line: str) -> bool:
    """Check whether the line ends a REPLACE block."""
    return bool(REPLACE_BLOCK_END_REGEX.match(line)) or bool(LEGACY_REPLACE_BLOCK_END_REGEX.match(line))


def line_trimmed_fallback_match(original_content: str, search_content: str, start_index: int) -> tuple[int, int] | bool:
    """
    Try a line-trimmed fallback match in the original content.

    It tries to match the lines of search_content against a block of lines in
    original_content, starting from start_index. Lines are compared after
    stripping leading and trailing whitespace and must then be identical.

    Returns (match_index_start, match_index_end) if found, otherwise False.
    """
    # Split both contents into lines
    original_lines = original_content.split("\n")
    search_lines = search_content.split("\n")

    # Drop a trailing empty line (from the trailing \n in search_content)
    if search_lines and search_lines[-1] == "":
        search_lines.pop()

    # Find the line number that start_index falls on
    start_line_num = 0
    current_index = 0
    while current_index < start_index and start_line_num < len(original_lines):
        current_index += len(original_lines[start_line_num]) + 1  # +1 for \n
        start_line_num += 1

    # For every possible starting position in the original content
    for i in range(start_line_num, len(original_lines) - len(search_lines) + 1):
        matches = True
        # Try to match all search lines from this position
        for j in range(len(search_lines)):
            original_trimmed = original_lines[i + j].strip()
            search_trimmed = search_lines[j].strip()
            if original_trimmed != search_trimmed:
                matches = False
                break

        # On a match, compute the exact character positions
        if matches:
            # Start character index
            match_start_index = 0
            for k in range(i):
                match_start_index += len(original_lines[k]) + 1  # +1 for \n
            # End character index
            match_end_index = match_start_index
            for k in range(len(search_lines)):
                match_end_index += len(original_lines[i + k]) + 1  # +1 for \n
            return match_start_index, match_end_index
    return False


def block_anchor_fallback_match(original_content: str, search_content: str, start_index: int) -> tuple[int, int] | bool:
    """
    Try to match a code block using its first and last lines as anchors.

    This is a third-level fallback strategy. It helps match blocks whose
    content differs slightly but which can be located by their first and
    last lines.

    Matching strategy:
    1. Only attempt blocks of 3 or more lines, to avoid false matches
    2. From the search content, take:
       - the first line as the "start anchor"
       - the last line as the "end anchor"
    3. For each position in the original content:
       - check whether the line matches the start anchor
       - if so, jump ahead by the size of the search block
       - check whether that line matches the end anchor
       - all comparisons are made after stripping whitespace

    @param original_content - full content of the original file
    @param search_content - the content we are trying to find in the original file
    @param start_index - character index in original_content at which to start searching
    @returns a (start_index, end_index) tuple if a match is found, otherwise False
    """
    original_lines = original_content.split("\n")
    search_lines = search_content.split("\n")

    # Only for blocks of 3+ lines
    if len(search_lines) < 3:
        return False

    # Drop a trailing empty line if present
    if search_lines and search_lines[-1] == "":
        search_lines.pop()

    first_line_search = search_lines[0].strip()
    last_line_search = search_lines[-1].strip()
    search_block_size = len(search_lines)

    # Find the line number that start_index falls on
    start_line_num = 0
    current_index = 0
    while current_index < start_index and start_line_num < len(original_lines):
        current_index += len(original_lines[start_line_num]) + 1
        start_line_num += 1

    # Look for matching start and end anchors
    for i in range(start_line_num, len(original_lines) - search_block_size + 1):
        # Does the first line match?
        if original_lines[i].strip() != first_line_search:
            continue
        # Does the last line match at the expected position?
        if original_lines[i + search_block_size - 1].strip() != last_line_search:
            continue

        # Compute the exact character positions
        match_start_index = 0
        for k in range(i):
            match_start_index += len(original_lines[k]) + 1
        match_end_index = match_start_index
        for k in range(search_block_size):
            match_end_index += len(original_lines[i + k]) + 1
        return match_start_index, match_end_index
    return False


def construct_new_file_content(
        diff_content: str,
        original_content: str,
        is_final: bool,
        version: str = "v1",
) -> str:
    """
    Rebuild the file content by applying a streamed diff (in the special
    SEARCH/REPLACE block format) to the original file content. It is designed
    to handle both incremental updates and the final file once all chunks
    have been processed.

    The diff format is a custom structure that uses three markers to define a change:

      ------- SEARCH
      [exact content to find in the original file]
      =======
      [replacement content]
      +++++++ REPLACE

    Behavior and assumptions:
    1. The file is processed chunk by chunk. Each chunk of diff_content may
       contain partial or complete SEARCH/REPLACE blocks. Calling this function
       for each incremental chunk (with is_final marking the last one) yields
       the final rebuilt file content.

    2. Matching strategies (in the order tried):
       a. Exact match: first try to find the exact SEARCH block text in the original file
       b. Line-trimmed match: fall back to a line-by-line comparison that ignores
          leading and trailing whitespace
       c. Block anchor match: for blocks of 3+ lines, try to match using the first
          and last lines as anchors
       If all strategies fail, an error is raised.

    3. Empty SEARCH section:
       - If SEARCH is empty and the original file is empty, this creates a new
         file (pure insertion).
       - If SEARCH is empty and the original file is not empty, this is a full
         file replacement (the whole original content is treated as matched and
         replaced). This is what v2 does; v1 raises an error in this case.

    4. Applying changes:
       - Before the "=======" marker, lines are accumulated as search content.
       - After "=======" and before "+++++++ REPLACE", lines are accumulated as
         replacement content.
       - Once a block is complete ("+++++++ REPLACE"), the matched part of the
         original file is replaced by the accumulated replacement lines and the
         position in the original file advances.

    5. Incremental output:
       - Once the match position is found and the REPLACE section has begun, each
         new replacement line is appended to the result so that partial updates
         can be viewed incrementally.

    6. Partial markers:
       - If the last line of a chunk looks like part of a marker but is not a
         known marker, it is removed. This keeps incomplete or partial markers
         from corrupting the output.

    7. Finalization:
       - Once all chunks are processed (is_final is True), any original content
         remaining after the last replaced part is appended to the result.
       - No trailing newline is forced. The code tries to output exactly what
         was specified.

    Errors:
    - If a SEARCH block cannot be matched by any available strategy, an error is raised.
    """
    if version == "v1":
        return construct_new_file_content_v1(diff_content, original_content, is_final)
    elif version == "v2":
        return construct_new_file_content_v2(diff_content, original_content, is_final)
    else:
        raise ValueError(f"Invalid version '{version}' for file content constructor")


def construct_new_file_content_v1(diff_content: str, original_content: str, is_final: bool) -> str:
    """v1 implementation of construct_new_file_content."""
    result = ""
    last_processed_index = 0

    current_search_content = ""
    current_replace_content = ""
    in_search = False
    in_replace = False

    search_match_index = -1
    search_end_index = -1

    # Track all replacements so out-of-order edits can be handled
    replacements: list[dict[str, int | str]] = []
    pending_out_of_order_replacement = False

    lines = diff_content.split("\n")

    # If the last line looks like a partial marker but is not recognized,
    # remove it, because it may be incomplete.
    if lines:
        last_line = lines[-1]
        if (
            last_line.startswith(SEARCH_BLOCK_CHAR)
            or last_line.startswith(LEGACY_SEARCH_BLOCK_CHAR)
            or last_line.startswith("=")
            or last_line.startswith(REPLACE_BLOCK_CHAR)
            or last_line.startswith(LEGACY_REPLACE_BLOCK_CHAR)
        ) and not (
            is_search_block_start(last_line)
            or is_search_block_end(last_line)
            or is_replace_block_end(last_line)
        ):
            lines.pop()

    for line in lines:
        if is_search_block_start(line):
            in_search = True
            current_search_content = ""
            current_replace_content = ""
            continue

        if is_search_block_end(line):
            in_search = False
            in_replace = True

            if not current_search_content:
                # Empty SEARCH block
                if len(original_content) == 0:
                    # New file: nothing to match, start inserting right away
                    search_match_index = 0
                    search_end_index = 0
                else:
                    # Error: an empty SEARCH block in a non-empty file means
                    # the SEARCH marker is malformed
                    raise ValueError(
                        "Empty SEARCH block detected with non-empty file. This usually indicates a malformed SEARCH marker.\n"
                        "Please ensure your SEARCH marker follows the correct format:\n"
                        "- Use '------- SEARCH' (7+ dashes + space + SEARCH)\n"
                    )
            else:
                # Exact search match
                exact_index = original_content.find(current_search_content, last_processed_index)
                if exact_index != -1:
                    search_match_index = exact_index
                    search_end_index = exact_index + len(current_search_content)
                else:
                    # Fall back to the line-trimmed match
                    line_match = line_trimmed_fallback_match(
                        original_content, current_search_content, last_processed_index)
                    if line_match:
                        search_match_index, search_end_index = line_match
                    else:
                        # For larger blocks, fall back to the block anchor match
                        block_match = block_anchor_fallback_match(
                            original_content, current_search_content, last_processed_index)
                        if block_match:
                            search_match_index, search_end_index = block_match
                        else:
                            # Last resort: search the whole file from the beginning
                            full_file_index = original_content.find(current_search_content, 0)
                            if full_file_index != -1:
                                # Found in the file - probably out of order
                                search_match_index = full_file_index
                                search_end_index = full_file_index + len(current_search_content)
                                if search_match_index < last_processed_index:
                                    pending_out_of_order_replacement = True
                            else:
                                raise ValueError(
                                    f"The SEARCH block:\n{current_search_content.rstrip()}\n...does not match anything in the file."
                                )

            # Check whether this is an out-of-order replacement
            if search_match_index < last_processed_index:
                pending_out_of_order_replacement = True

            # For in-order replacements, output everything up to the match position
            if not pending_out_of_order_replacement:
                result += original_content[last_processed_index:search_match_index]
            continue

        if is_replace_block_end(line):
            # A replacement block is complete
            if search_match_index == -1:
                raise ValueError(f"The SEARCH block:\n{current_search_content.rstrip()}\n...is malformatted.")

            # Store this replacement
            replacements.append({
                "start": search_match_index,
                "end": search_end_index,
                "content": current_replace_content,
            })

            # For in-order replacements, advance last_processed_index
            if not pending_out_of_order_replacement:
                last_processed_index = search_end_index

            # Reset for the next block
            in_search = False
            in_replace = False
            current_search_content = ""
            current_replace_content = ""
            search_match_index = -1
            search_end_index = -1
            pending_out_of_order_replacement = False
            continue

        # Accumulate search or replacement content
        if in_search:
            current_search_content += line + "\n"
        elif in_replace:
            current_replace_content += line + "\n"
            # Only in-order replacements output replacement lines immediately
            if search_match_index != -1 and not pending_out_of_order_replacement:
                result += line + "\n"

    # For the final chunk, apply all replacements and build the final result
    if is_final:
        # Handle the case where we are still in replace mode at the end of
        # processing, and treat it as if the REPLACE marker had been seen
        if in_replace and search_match_index != -1:
            # Store this replacement
            replacements.append({
                "start": search_match_index,
                "end": search_end_index,
                "content": current_replace_content,
            })

            # For in-order replacements, advance last_processed_index
            if not pending_out_of_order_replacement:
                last_processed_index = search_end_index

            # Reset state
            in_search = False
            in_replace = False
            current_search_content = ""
            current_replace_content = ""
            search_match_index = -1
            search_end_index = -1
            pending_out_of_order_replacement = False

        # Sort the replacements by start position
        replacements.sort(key=lambda x: x["start"])

        # Rebuild the whole result by applying all replacements
        result = ""
        current_pos = 0
        for replacement in replacements:
            # Original content up to this replacement
            result += original_content[current_pos:replacement["start"]]
            # Replacement content
            result += replacement["content"]
            # Move past the replaced part
            current_pos = replacement["end"]

        # Any remaining original content
        result += original_content[current_pos:]

    return result


from enum import IntFlag


class ProcessingState(IntFlag):
    Idle = 0
    StateSearch = 1 << 0
    StateReplace = 1 << 1


class NewFileContentConstructor:
    """v2 file content constructor class."""

    def __init__(self, original_content: str, is_final: bool):
        self.original_content = original_content
        self.is_final = is_final
        self.pending_non_standard_lines: list[str] = []
        self.result = ""
        self.last_processed_index = 0
        self.state = ProcessingState.Idle
        self.current_search_content = ""
        self.current_replace_content = ""
        self.search_match_index = -1
        self.search_end_index = -1

    def reset_for_next_block(self):
        """Reset for the next block."""
        self.state = ProcessingState.Idle
        self.current_search_content = ""
        self.current_replace_content = ""
        self.search_match_index = -1
        self.search_end_index = -1

    def find_last_matching_line_index(self, regex: re.Pattern, line_limit: int) -> int:
        """Find the index of the last matching line."""
        for i in range(line_limit - 1, -1, -1):
            if regex.match(self.pending_non_standard_lines[i]):
                return i
        return -1

    def update_processing_state(self, new_state: ProcessingState):
        """Update the processing state."""
        is_valid_transition = (
            (self.state == ProcessingState.Idle and new_state == ProcessingState.StateSearch)
            or (self.state == ProcessingState.StateSearch and new_state == ProcessingState.StateReplace)
        )
        if not is_valid_transition:
            raise ValueError(
                "Invalid state transition.\n"
                "Valid transitions are:\n"
                "- Idle → StateSearch\n"
                "- StateSearch → StateReplace"
            )
        self.state |= new_state

    def is_state_active(self, state: ProcessingState) -> bool:
        """Check whether a state is active."""
        return (self.state & state) == state

    def activate_replace_state(self):
        """Activate the replace state."""
        self.update_processing_state(ProcessingState.StateReplace)

    def activate_search_state(self):
        """Activate the search state."""
        self.update_processing_state(ProcessingState.StateSearch)
        self.current_search_content = ""
        self.current_replace_content = ""

    def is_searching_active(self) -> bool:
        """Check whether searching is active."""
        return self.is_state_active(ProcessingState.StateSearch)

    def is_replacing_active(self) -> bool:
        """Check whether replacing is active."""
        return self.is_state_active(ProcessingState.StateReplace)

    def has_pending_non_standard_lines(self, pending_non_standard_line_limit: int) -> bool:
        """Check whether there are pending non-standard lines."""
        return len(self.pending_non_standard_lines) - pending_non_standard_line_limit < len(self.pending_non_standard_lines)

    def process_line(self, line: str):
        """Process one line."""
        self.internal_process_line(line, True, len(self.pending_non_standard_lines))

    def get_result(self) -> str:
        """Return the result."""
        # For the final chunk, append any remaining original content
        if self.is_final and self.last_processed_index < len(self.original_content):
            self.result += self.original_content[self.last_processed_index:]
        if self.is_final and self.state != ProcessingState.Idle:
            raise ValueError("File processing incomplete - SEARCH/REPLACE operations still active during finalization")
        return self.result

    def internal_process_line(
            self,
            line: str,
            can_write_pending_non_standard_lines: bool,
            pending_non_standard_line_limit: int,
    ) -> int:
        """Process one line (internal)."""
        remove_line_count = 0
        if is_search_block_start(line):
            remove_line_count = self.trim_pending_non_standard_trailing_empty_lines(pending_non_standard_line_limit)
            if remove_line_count > 0:
                pending_non_standard_line_limit -= remove_line_count
            if self.has_pending_non_standard_lines(pending_non_standard_line_limit):
                self.try_fix_search_replace_block(pending_non_standard_line_limit)
                if can_write_pending_non_standard_lines:
                    self.pending_non_standard_lines.clear()
            self.activate_search_state()
        elif is_search_block_end(line):
            # Validate non-standard content
            if not self.is_searching_active():
                self.try_fix_search_block(pending_non_standard_line_limit)
                if can_write_pending_non_standard_lines:
                    self.pending_non_standard_lines.clear()
            self.activate_replace_state()
            self.before_replace()
        elif is_replace_block_end(line):
            if not self.is_replacing_active():
                self.try_fix_replace_block(pending_non_standard_line_limit)
                if can_write_pending_non_standard_lines:
                    self.pending_non_standard_lines.clear()
            self.last_processed_index = self.search_end_index
            self.reset_for_next_block()
        else:
            # Accumulate search or replacement content
            if self.is_replacing_active():
                self.current_replace_content += line + "\n"
                # If the insertion point is known, output the replacement line immediately
                if self.search_match_index != -1:
                    self.result += line + "\n"
            elif self.is_searching_active():
                self.current_search_content += line + "\n"
            else:
                if can_write_pending_non_standard_lines:
                    # Handle non-standard content
                    self.pending_non_standard_lines.append(line)
        return remove_line_count

    def before_replace(self):
        """Work done before replacing."""
        if not self.current_search_content:
            # Empty SEARCH block
            if len(self.original_content) == 0:
                # New file: nothing to match, start inserting right away
                self.search_match_index = 0
                self.search_end_index = 0
            else:
                # Full file replacement: treat the whole file as matched
                self.search_match_index = 0
                self.search_end_index = len(self.original_content)
        else:
            # Exact search match
            exact_index = self.original_content.find(self.current_search_content, self.last_processed_index)
            if exact_index != -1:
                self.search_match_index = exact_index
                self.search_end_index = exact_index + len(self.current_search_content)
            else:
                # Fall back to the line-trimmed match
                line_match = line_trimmed_fallback_match(
                    self.original_content, self.current_search_content, self.last_processed_index)
                if line_match:
                    self.search_match_index, self.search_end_index = line_match
                else:
                    # For larger blocks, fall back to the block anchor match
                    block_match = block_anchor_fallback_match(
                        self.original_content, self.current_search_content, self.last_processed_index)
                    if block_match:
                        self.search_match_index, self.search_end_index = block_match
                    else:
                        raise ValueError(
                            f"The SEARCH block:\n{self.current_search_content.rstrip()}\n...does not match anything in the file."
                        )

        if self.search_match_index < self.last_processed_index:
            raise ValueError(
                f"The SEARCH block:\n{self.current_search_content.rstrip()}\n...matched an incorrect content in the file."
            )

        # Output everything up to the match position
        self.result += self.original_content[self.last_processed_index:self.search_match_index]

    def try_fix_search_block(self, line_limit: int) -> int:
        """Try to repair a SEARCH block."""
        remove_line_count = 0
        if line_limit < 0:
            line_limit = len(self.pending_non_standard_lines)
        if not line_limit:
            raise ValueError("Invalid SEARCH/REPLACE block structure - no lines available to process")
        search_tag_regex = re.compile(r"^([-]{3,}|[<]{3,}) SEARCH$")
        search_tag_index = self.find_last_matching_line_index(search_tag_regex, line_limit)
        if search_tag_index != -1:
            fix_lines = self.pending_non_standard_lines[search_tag_index:line_limit]
            fix_lines[0] = SEARCH_BLOCK_START
            for line in fix_lines:
                remove_line_count += self.internal_process_line(line, False, search_tag_index)
        else:
            raise ValueError(
                f"Invalid REPLACE marker detected - could not find matching SEARCH block starting from line {search_tag_index + 1}"
            )
        return remove_line_count

    def try_fix_replace_block(self, line_limit: int) -> int:
        """Try to repair a REPLACE block."""
        remove_line_count = 0
        if line_limit < 0:
            line_limit = len(self.pending_non_standard_lines)
        if not line_limit:
            raise ValueError()
        replace_begin_tag_regex = re.compile(r"^[=]{3,}$")
        replace_begin_tag_index = self.find_last_matching_line_index(replace_begin_tag_regex, line_limit)
        if replace_begin_tag_index != -1:
            fix_lines = self.pending_non_standard_lines[
                replace_begin_tag_index - remove_line_count: line_limit - remove_line_count]
            fix_lines[0] = SEARCH_BLOCK_END
            for line in fix_lines:
                remove_line_count += self.internal_process_line(
                    line, False, replace_begin_tag_index - remove_line_count)
        else:
            raise ValueError(
                f"Malformed REPLACE block - missing valid separator after line {replace_begin_tag_index + 1}"
            )
        return remove_line_count

    def try_fix_search_replace_block(self, line_limit: int) -> int:
        """Try to repair a SEARCH/REPLACE block."""
        remove_line_count = 0
        if line_limit < 0:
            line_limit = len(self.pending_non_standard_lines)
        if not line_limit:
            raise ValueError()
        replace_end_tag_regex = re.compile(r"^([+]{3,}|[>]{3,}) REPLACE$")
        replace_end_tag_index = self.find_last_matching_line_index(replace_end_tag_regex, line_limit)
        like_replace_end_tag = replace_end_tag_index == line_limit - 1
        if like_replace_end_tag:
            fix_lines = self.pending_non_standard_lines[
                replace_end_tag_index - remove_line_count: line_limit - remove_line_count]
            fix_lines[-1] = REPLACE_BLOCK_END
            for line in fix_lines:
                remove_line_count += self.internal_process_line(
                    line, False, replace_end_tag_index - remove_line_count)
        else:
            raise ValueError("Malformed SEARCH/REPLACE block structure: Missing valid closing REPLACE marker")
        return remove_line_count

    def trim_pending_non_standard_trailing_empty_lines(self, line_limit: int) -> int:
        """
        Remove trailing empty lines from the pending_non_standard_lines list.

        @param line_limit - index at which to start checking (exclusive). Empty
                            lines are removed backwards from line_limit - 1.
        @returns the number of empty lines removed
        """
        removed_count = 0
        i = min(line_limit, len(self.pending_non_standard_lines)) - 1
        while i >= 0 and self.pending_non_standard_lines[i].strip() == "":
            self.pending_non_standard_lines.pop()
            removed_count += 1
            i -= 1
        return removed_count


def construct_new_file_content_v2(diff_content: str, original_content: str, is_final: bool) -> str:
    """v2 implementation of construct_new_file_content."""
    new_file_content_constructor = NewFileContentConstructor(original_content, is_final)

    lines = diff_content.split("\n")

    # If the last line looks like a partial marker but is not recognized,
    # remove it, because it may be incomplete.
    if lines:
        last_line = lines[-1]
        if (
            last_line.startswith(SEARCH_BLOCK_CHAR)
            or last_line.startswith(LEGACY_SEARCH_BLOCK_CHAR)
            or last_line.startswith("=")
            or last_line.startswith(REPLACE_BLOCK_CHAR)
            or last_line.startswith(LEGACY_REPLACE_BLOCK_CHAR)
        ) and (
            last_line != SEARCH_BLOCK_START
            and last_line != SEARCH_BLOCK_END
            and last_line != REPLACE_BLOCK_END
        ):
            lines.pop()

    for line in lines:
        new_file_content_constructor.process_line(line)

    result = new_file_content_constructor.get_result()
    return result
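Stripped of streaming support, fallback matching, and out-of-order handling, the core idea reduces to a small line-based state machine. The following is a simplified sketch for illustration, not the repository's implementation: apply_search_replace is a hypothetical name, it accepts only the exact seven-character markers, and it uses exact matching only.

```python
def apply_search_replace(original: str, diff: str) -> str:
    """Apply SEARCH/REPLACE blocks with exact matching only (illustrative sketch)."""
    result = []      # pieces of the rebuilt file
    cursor = 0       # index in `original` up to which content has been emitted
    search, replace = [], []
    mode = None      # None, "search" or "replace"

    for line in diff.split("\n"):
        if line == "------- SEARCH":
            mode, search, replace = "search", [], []
        elif line == "=======" and mode == "search":
            mode = "replace"
        elif line == "+++++++ REPLACE" and mode == "replace":
            # Every accumulated line gets a trailing newline, as in the full version
            needle = "".join(part + "\n" for part in search)
            start = original.find(needle, cursor)  # first match after the cursor
            if start == -1:
                raise ValueError(f"SEARCH block not found:\n{needle}")
            result.append(original[cursor:start])  # untouched content before the match
            result.append("".join(part + "\n" for part in replace))
            cursor = start + len(needle)
            mode = None
        elif mode == "search":
            search.append(line)
        elif mode == "replace":
            replace.append(line)

    result.append(original[cursor:])  # remaining original content
    return "".join(result)


original = "<html>\n<title>Old</title>\n<body>hi</body>\n</html>\n"
diff = (
    "------- SEARCH\n<title>Old</title>\n=======\n"
    "<title>New</title>\n+++++++ REPLACE\n"
)
updated = apply_search_replace(original, diff)
```

Because the search starts at the cursor, blocks must appear in file order here, which is the same ordering the prompt asks the model to follow; the v1 implementation above relaxes this by collecting replacements and sorting them at the end.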
Demo
See https://github.com/AriesYB/cline-tools/blob/main/demo/main.py
Project structure
cline-tools/
├── demo/                      # Example code
│   ├── main.py                # Usage example
│   └── old.html               # Old version of the HTML file
├── llm_tools/                 # LLM tool definitions
│   ├── edit_file_by_diff.py   # File editing tool
│   └── tool_annotation.py     # Tool annotation
├── tools/                     # Core tool implementation
│   └── edit_tool.py           # Core file editing logic
├── llm_msg_parser.py          # LLM message parser
├── prompt.py                  # Prompt template
└── pyproject.toml             # Project configuration