
Manus AI Principles in Depth, Part 3: Tools

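Preface

The first two articles in this series covered Manus's Prompt (Manus AI Principles in Depth, Part 1: Prompt) and its Modules & Agent Loop (Manus AI Principles in Depth, Part 2: Modules & Agent Loop). This article covers the remaining piece: Tools.

Tools are the concrete execution units through which Manus interacts with its external environment. They cover file, terminal, browser, and network operations, and support multi-language and multi-scenario needs. Without further ado, let's go through them.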

Message Tools

Tools for handling messages to and from the user.

message_notify_user

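A tool that sends the user a message that requires no reply.

Use it to acknowledge receipt of a message, provide progress updates, report task completion, or explain a change in approach.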

  {"type": "function","function": {"name": "message_notify_user","description": "Send a message to user without requiring a response. Use for acknowledging receipt of messages, providing progress updates, reporting task completion, or explaining changes in approach.","parameters": {"type": "object","properties": {"text": {"type": "string","description": "Message text to display to user"},"attachments": {"anyOf": [{"type": "string"},{"items": {"type": "string"}, "type": "array"}],"description": "(Optional) List of attachments to show to user, can be file paths or URLs"}},"required": ["text"]}}}

message_ask_user

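A tool that asks the user a question and waits for the reply.

Use it to request clarification, ask for confirmation, or gather additional information.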

  {"type": "function","function": {"name": "message_ask_user","description": "Ask user a question and wait for response. Use for requesting clarification, asking for confirmation, or gathering additional information.","parameters": {"type": "object","properties": {"text": {"type": "string","description": "Question text to present to user"},"attachments": {"anyOf": [{"type": "string"},{"items": {"type": "string"}, "type": "array"}],"description": "(Optional) List of question-related files or reference materials"},"suggest_user_takeover": {"type": "string","enum": ["none", "browser"],"description": "(Optional) Suggested operation for user takeover"}},"required": ["text"]}}}
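Both message schemas mark only `text` as required, and `suggest_user_takeover` is restricted to an enum. A minimal sketch of how a caller might check arguments before dispatching a call is shown below. The helper `check_call` and the trimmed `ASK_USER_SCHEMA` constant are hypothetical names introduced for illustration; nothing here is confirmed to be how Manus validates calls.

```python
# Trimmed copy of the message_ask_user schema shown above (descriptions dropped).
ASK_USER_SCHEMA = {
    "name": "message_ask_user",
    "parameters": {
        "type": "object",
        "properties": {
            "text": {"type": "string"},
            "attachments": {
                "anyOf": [
                    {"type": "string"},
                    {"type": "array", "items": {"type": "string"}},
                ]
            },
            "suggest_user_takeover": {"type": "string", "enum": ["none", "browser"]},
        },
        "required": ["text"],
    },
}


def check_call(schema: dict, arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the call looks valid.

    Hypothetical helper: it only checks required keys, unknown keys, and enums.
    """
    params = schema["parameters"]
    props = params.get("properties", {})
    problems = []
    for key in params.get("required", []):
        if key not in arguments:
            problems.append(f"missing required argument: {key}")
    for key, value in arguments.items():
        if key not in props:
            problems.append(f"unknown argument: {key}")
        elif "enum" in props[key] and value not in props[key]["enum"]:
            problems.append(f"invalid value for {key}: {value!r}")
    return problems
```

A full JSON Schema validator would also check types and the `anyOf` branch for `attachments`; this sketch leaves those out to stay short.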

File Processing Tools

Tools for working with files.

file_read

A tool that reads file content.

Use it to check file contents, analyze logs, or read configuration files.

  {"type": "function","function": {"name": "file_read","description": "Read file content. Use for checking file contents, analyzing logs, or reading configuration files.","parameters": {"type": "object","properties": {"file": {"type": "string","description": "Absolute path of the file to read"},"start_line": {"type": "integer","description": "(Optional) Starting line to read from, 0-based"},"end_line": {"type": "integer","description": "(Optional) Ending line number (exclusive)"},"sudo": {"type": "boolean","description": "(Optional) Whether to use sudo privileges"}},"required": ["file"]}}}
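The schema describes `start_line` as 0-based and `end_line` as exclusive, which happens to match Python slice semantics. The following is a minimal local sketch of that behavior, not the actual Manus implementation; the `sudo` flag is left out, and returning the lines joined as one string is an assumption.

```python
from pathlib import Path
from typing import Optional


def file_read(file: str, start_line: Optional[int] = None,
              end_line: Optional[int] = None) -> str:
    """Return the file's lines in the half-open range [start_line, end_line)."""
    lines = Path(file).read_text(encoding="utf-8").splitlines(keepends=True)
    # A slice with None bounds reads from the start and/or to the end of the file.
    return "".join(lines[start_line:end_line])
```

For example, `file_read(path, start_line=1, end_line=3)` returns the second and third lines of the file.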

file_write

A tool that overwrites a file or appends content to it.

Use it to create new files, append content, or modify existing files.

  {"type": "function","function": {"name": "file_write","description": "Overwrite or append content to a file. Use for creating new files, appending content, or modifying existing files.","parameters": {"type": "object","properties": {"file": {"type": "string","description": "Absolute path of the file to write to"},"content": {"type": "string","description": "Text content to write"},"append": {"type": "boolean","description": "(Optional) Whether to use append mode"},"leading_newline": {"type": "boolean","description": "(Optional) Whether to add a leading newline"},"trailing_newline": {"type": "boolean","description": "(Optional) Whether to add a trailing newline"},"sudo": {"type": "boolean","description": "(Optional) Whether to use sudo privileges"}},"required": ["file", "content"]}}}
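The `append`, `leading_newline`, and `trailing_newline` flags interact in a simple way that a short sketch can make concrete. The version below is an illustration under assumptions: all three flags are assumed to default to false when omitted, and `sudo` is not modeled.

```python
def file_write(file: str, content: str, append: bool = False,
               leading_newline: bool = False,
               trailing_newline: bool = False) -> None:
    """Overwrite the file, or append to it when append is True."""
    text = ("\n" if leading_newline else "") + content
    if trailing_newline:
        text += "\n"
    mode = "a" if append else "w"  # "w" truncates any existing content
    with open(file, mode, encoding="utf-8") as handle:
        handle.write(text)
```

Appending a new line to a log that may not end with a newline is the typical case for `leading_newline=True`.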

file_str_replace

A tool that replaces a specified string in a file.

Use it to update specific content in a file or to fix errors in code.

  {"type": "function","function": {"name": "file_str_replace","description": "Replace specified string in a file. Use for updating specific content in files or fixing errors in code.","parameters": {"type": "object","properties": {"file": {"type": "string","description": "Absolute path of the file to perform replacement on"},"old_str": {"type": "string","description": "Original string to be replaced"},"new_str": {"type": "string","description": "New string to replace with"},"sudo": {"type": "boolean","description": "(Optional) Whether to use sudo privileges"}},"required": ["file", "old_str", "new_str"]}}}
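The schema does not say whether every occurrence of `old_str` is replaced or only the first, nor what the tool returns. The sketch below assumes all occurrences are replaced and returns the number of replacements; both choices are assumptions made for illustration, and `sudo` is omitted.

```python
from pathlib import Path


def file_str_replace(file: str, old_str: str, new_str: str) -> int:
    """Replace every occurrence of old_str and return how many were replaced."""
    if not old_str:
        raise ValueError("old_str must not be empty")
    path = Path(file)
    text = path.read_text(encoding="utf-8")
    count = text.count(old_str)
    if count:
        # Only rewrite the file when something actually changes.
        path.write_text(text.replace(old_str, new_str), encoding="utf-8")
    return count
```

Returning a count lets the caller detect a no-op replacement, which usually means `old_str` did not match the file exactly.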

file_find_in_content

A tool that searches a file's content for matching text.

Use it to find specific content or patterns in a file.

  {"type": "function","function": {"name": "file_find_in_content","description": "Search for matching text within file content. Use for finding specific content or patterns in files.","parameters": {"type": "object","properties": {"file": {"type": "string","description": "Absolute path of the file to search within"},"regex": {"type": "string","description": "Regular expression pattern to match"},"sudo": {"type": "boolean","description": "(Optional) Whether to use sudo privileges"}},"required": ["file", "regex"]}}}
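A rough local equivalent of this tool is a line-by-line regular expression search. In the sketch below, the return shape (0-based line number plus line text) is an assumption chosen to line up with the 0-based `start_line` of `file_read`; the schema itself does not specify the output format.

```python
import re
from pathlib import Path


def file_find_in_content(file: str, regex: str) -> list[tuple[int, str]]:
    """Return (0-based line number, line text) for each line matching regex."""
    pattern = re.compile(regex)
    matches = []
    lines = Path(file).read_text(encoding="utf-8").splitlines()
    for number, line in enumerate(lines):
        if pattern.search(line):
            matches.append((number, line))
    return matches
```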

file_find_by_name

A tool that finds files by name pattern in a specified directory.

Use it to locate files that follow a specific naming pattern.

  {"type": "function","function": {"name": "file_find_by_name","description": "Find files by name pattern in specified directory. Use for locating files with specific naming patterns.","parameters": {"type": "object","properties": {"path": {"type": "string","description": "Absolute path of directory to search"},"glob": {"type": "string","description": "Filename pattern using glob syntax wildcards"}},"required": ["path", "glob"]}}}
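The `glob` parameter maps naturally onto `pathlib.Path.glob` from Python's standard library. The sketch below is one possible reading of the tool: results are returned as sorted path strings, and the search descends into subdirectories only when the pattern itself uses `**`. Whether the real tool recurses by default is not stated in the schema.

```python
from pathlib import Path


def file_find_by_name(path: str, glob: str) -> list[str]:
    """Return sorted paths under `path` whose names match the glob pattern."""
    return sorted(str(match) for match in Path(path).glob(glob))
```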

Bash/Shell Tools

Tools for command-line operations.

shell_exec

A tool that executes a command in a specified shell session.

Use it to run code, install packages, or manage files.

  {"type": "function","function": {"name": "shell_exec","description": "Execute commands in a specified shell session. Use for running code, installing packages, or managing files.","parameters": {"type": "object","properties": {"id": {"type": "string","description": "Unique identifier of the target shell session"},"exec_dir": {"type": "string","description": "Working directory for command execution (must use absolute path)"},"command": {"type": "string","description": "Shell command to execute"}},"required": ["id", "exec_dir", "command"]}}}

shell_view

A tool that views the content of a specified shell session.

Use it to check command execution results or monitor output.

  {"type": "function","function": {"name": "shell_view","description": "View the content of a specified shell session. Use for checking command execution results or monitoring output.","parameters": {"type": "object","properties": {"id": {"type": "string","description": "Unique identifier of the target shell session"}},"required": ["id"]}}}
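The pairing of `shell_exec` and `shell_view` suggests that output is stored per session `id` and read back later. The sketch below models only that idea with `subprocess.run`. It is a deliberate simplification and an assumption: it blocks until the command finishes, whereas the presence of `shell_wait` and `shell_kill_process` implies the real sessions run commands asynchronously. The `_session_output` store and the `session_id` parameter name (standing in for the schema's `id`) are hypothetical.

```python
import os
import subprocess

# Hypothetical in-memory store: accumulated output per shell session id.
_session_output: dict[str, str] = {}


def shell_exec(session_id: str, exec_dir: str, command: str) -> int:
    """Run command in exec_dir, record its output under session_id, return exit code."""
    if not os.path.isabs(exec_dir):
        raise ValueError("exec_dir must be an absolute path")
    completed = subprocess.run(
        command, shell=True, cwd=exec_dir, capture_output=True, text=True
    )
    previous = _session_output.get(session_id, "")
    _session_output[session_id] = previous + completed.stdout + completed.stderr
    return completed.returncode


def shell_view(session_id: str) -> str:
    """Return everything recorded so far for the session (empty if unknown)."""
    return _session_output.get(session_id, "")
```

A closer model would keep a long-lived `subprocess.Popen` per session so that later calls could wait on, write to, or terminate the running process.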

shell_wait

A tool that waits for the running process in a specified shell session to return.

Use it after running a command that needs a longer runtime.

  {"type": "function","function": {"name": "shell_wait","description": "Wait for the running process in a specified shell session to return. Use after running commands that require longer runtime.","parameters": {"type": "object","properties": {"id": {"type": "string","description": "Unique identifier of the target shell session"},"seconds": {"type": "integer","description": "Wait duration in seconds"}},"required": ["id"]}}}

shell_write_to_process

A tool that writes input to the running process in a specified shell session.

Use it to respond to interactive command prompts.

  {"type": "function","function": {"name": "shell_write_to_process","description": "Write input to a running process in a specified shell session. Use for responding to interactive command prompts.","parameters": {"type": "object","properties": {"id": {"type": "string","description": "Unique identifier of the target shell session"},"input": {"type": "string","description": "Input content to write to the process"},"press_enter": {"type": "boolean","description": "Whether to press Enter key after input"}},"required": ["id", "input", "press_enter"]}}}

shell_kill_process

A tool that terminates the running process in a specified shell session.

Use it to stop long-running processes or to handle frozen commands.

  {"type": "function","function": {"name": "shell_kill_process","description": "Terminate a running process in a specified shell session. Use for stopping long-running processes or handling frozen commands.","parameters": {"type": "object","properties": {"id": {"type": "string","description": "Unique identifier of the target shell session"}},"required": ["id"]}}}

Browser-use Tools

Tools for browser operations.

browser_view

A tool that views the content of the current browser page.

Use it to check the latest state of a previously opened page.

  {"type": "function","function": {"name": "browser_view","description": "View content of the current browser page. Use for checking the latest state of previously opened pages.","parameters": {"type": "object"}}}

browser_navigate

A tool that navigates the browser to a specified URL.

Use it when a new page needs to be accessed.

  {"type": "function","function": {"name": "browser_navigate","description": "Navigate browser to specified URL. Use when accessing new pages is needed.","parameters": {"type": "object","properties": {"url": {"type": "string","description": "Complete URL to visit. Must include protocol prefix."}},"required": ["url"]}}}

browser_restart

A tool that restarts the browser and navigates to a specified URL.

Use it when the browser state needs to be reset.

  {"type": "function","function": {"name": "browser_restart","description": "Restart browser and navigate to specified URL. Use when browser state needs to be reset.","parameters": {"type": "object","properties": {"url": {"type": "string","description": "Complete URL to visit after restart. Must include protocol prefix."}},"required": ["url"]}}}

browser_click

A tool that clicks an element on the current browser page.

Use it when a page element needs to be clicked.

  {"type": "function","function": {"name": "browser_click","description": "Click on elements in the current browser page. Use when clicking page elements is needed.","parameters": {"type": "object","properties": {"index": {"type": "integer","description": "(Optional) Index number of the element to click"},"coordinate_x": {"type": "number","description": "(Optional) X coordinate of click position"},"coordinate_y": {"type": "number","description": "(Optional) Y coordinate of click position"}}}}}
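The `browser_click` schema marks `index`, `coordinate_x`, and `coordinate_y` all as optional and has no `required` list, so the schema alone would accept a call with no target at all. A caller would presumably supply either an element index or a full coordinate pair. The hypothetical helper below sketches that check; the rule that the two modes are mutually exclusive is an assumption, not something the schema states.

```python
from typing import Optional, Union

Number = Union[int, float]


def resolve_click_target(index: Optional[int] = None,
                         coordinate_x: Optional[Number] = None,
                         coordinate_y: Optional[Number] = None) -> tuple:
    """Return ("index", i) or ("coordinates", (x, y)); reject ambiguous input."""
    has_index = index is not None
    coordinates = (coordinate_x, coordinate_y)
    has_any_xy = any(value is not None for value in coordinates)
    has_full_xy = all(value is not None for value in coordinates)
    if has_index and not has_any_xy:
        return ("index", index)
    if has_full_xy and not has_index:
        return ("coordinates", coordinates)
    raise ValueError("provide either index or both coordinate_x and coordinate_y")
```

The same check would apply to `browser_input`, which shares the three targeting parameters.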

browser_input

A tool that overwrites the text in an editable element on the current browser page.

Use it to fill in input fields.

  {"type": "function","function": {"name": "browser_input","description": "Overwrite text in editable elements on the current browser page. Use when filling content in input fields.","parameters": {"type": "object","properties": {"index": {"type": "integer","description": "(Optional) Index number of the element to overwrite text"},"coordinate_x": {"type": "number","description": "(Optional) X coordinate of the element to overwrite text"},"coordinate_y": {"type": "number","description": "(Optional) Y coordinate of the element to overwrite text"},"text": {"type": "string","description": "Complete text content to overwrite"},"press_enter": {"type": "boolean","description": "Whether to press Enter key after input"}},"required": ["text", "press_enter"]}}}

browser_move_mouse

A tool that moves the cursor to a specified position on the current browser page.

Use it to simulate user mouse movement.

  {"type": "function","function": {"name": "browser_move_mouse","description": "Move cursor to specified position on the current browser page. Use when simulating user mouse movement.","parameters": {"type": "object","properties": {"coordinate_x": {"type": "number","description": "X coordinate of target cursor position"},"coordinate_y": {"type": "number","description": "Y coordinate of target cursor position"}},"required": ["coordinate_x", "coordinate_y"]}}}

browser_press_key

A tool that simulates a key press on the current browser page.

Use it when a specific keyboard operation is needed.

  {"type": "function","function": {"name": "browser_press_key","description": "Simulate key press in the current browser page. Use when specific keyboard operations are needed.","parameters": {"type": "object","properties": {"key": {"type": "string","description": "Key name to simulate (e.g., Enter, Tab, ArrowUp), supports key combinations (e.g., Control+Enter)."}},"required": ["key"]}}}

browser_select_option

A tool that selects a specific option from a dropdown list element on the current browser page.

Use it to choose dropdown menu options.

  {"type": "function","function": {"name": "browser_select_option","description": "Select specified option from dropdown list element in the current browser page. Use when selecting dropdown menu options.","parameters": {"type": "object","properties": {"index": {"type": "integer","description": "Index number of the dropdown list element"},"option": {"type": "integer","description": "Option number to select, starting from 0."}},"required": ["index", "option"]}}}

browser_scroll_up

A tool that scrolls the current browser page up.

Use it to view content above or to return to the top of the page.

  {"type": "function","function": {"name": "browser_scroll_up","description": "Scroll up the current browser page. Use when viewing content above or returning to page top.","parameters": {"type": "object","properties": {"to_top": {"type": "boolean","description": "(Optional) Whether to scroll directly to page top instead of one viewport up."}}}}}

browser_scroll_down

A tool that scrolls the current browser page down.

Use it to view content below or to jump to the bottom of the page.

  {"type": "function","function": {"name": "browser_scroll_down","description": "Scroll down the current browser page. Use when viewing content below or jumping to page bottom.","parameters": {"type": "object","properties": {"to_bottom": {"type": "boolean","description": "(Optional) Whether to scroll directly to page bottom instead of one viewport down."}}}}}

browser_console_exec

A tool that executes JavaScript code in the browser console.

Use it when a custom script needs to be run.

  {"type": "function","function": {"name": "browser_console_exec","description": "Execute JavaScript code in browser console. Use when custom scripts need to be executed.","parameters": {"type": "object","properties": {"javascript": {"type": "string","description": "JavaScript code to execute. Note that the runtime environment is browser console."}},"required": ["javascript"]}}}

browser_console_view

A tool that views the browser console output.

Use it to check JavaScript logs or debug page errors.

  {"type": "function","function": {"name": "browser_console_view","description": "View browser console output. Use when checking JavaScript logs or debugging page errors.","parameters": {"type": "object","properties": {"max_lines": {"type": "integer","description": "(Optional) Maximum number of log lines to return."}}}}}

Web Search Tools

Tools for searching the web.

info_search_web

A tool that searches web pages using a search engine.

Use it to obtain up-to-date information or to find reference material.

  {"type": "function","function": {"name": "info_search_web","description": "Search web pages using search engine. Use for obtaining latest information or finding references.","parameters": {"type": "object","properties": {"query": {"type": "string","description": "Search query in Google search style, using 3-5 keywords."},"date_range": {"type": "string","enum": ["all", "past_hour", "past_day", "past_week", "past_month", "past_year"],"description": "(Optional) Time range filter for search results."}},"required": ["query"]}}}
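To make the `date_range` enum concrete, the sketch below builds a call payload for `info_search_web` and rejects values outside the enum. The payload shape (a `name` plus JSON-encoded `arguments`) follows the common function-calling convention that these schemas resemble; that Manus uses exactly this wire format is an assumption, and `build_search_call` is a hypothetical helper.

```python
import json

# Allowed values copied from the date_range enum in the schema above.
DATE_RANGES = ("all", "past_hour", "past_day", "past_week", "past_month", "past_year")


def build_search_call(query: str, date_range: str = "all") -> dict:
    """Build a tool-call payload for info_search_web, validating date_range."""
    if date_range not in DATE_RANGES:
        raise ValueError(f"date_range must be one of {DATE_RANGES}")
    arguments = {"query": query, "date_range": date_range}
    return {"name": "info_search_web", "arguments": json.dumps(arguments)}
```

The schema's hint to use 3-5 keywords is guidance for the model and is not enforced here.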

Deploy Tools

Tools for deploying and launching projects.

deploy_expose_port

A tool that exposes a specified local port for temporary public access.

Use it to give a service temporary public access.

  {"type": "function","function": {"name": "deploy_expose_port","description": "Expose specified local port for temporary public access. Use when providing temporary public access for services.","parameters": {"type": "object","properties": {"port": {"type": "integer","description": "Local port number to expose"}},"required": ["port"]}}}

deploy_apply_deployment

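A tool that deploys a website or application to a public production environment.

Use it to deploy or update static websites or applications.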

  {"type": "function","function": {"name": "deploy_apply_deployment","description": "Deploy website or application to public production environment. Use when deploying or updating static websites or applications.","parameters": {"type": "object","properties": {"type": {"type": "string","enum": ["static", "nextjs"],"description": "Type of website or application to deploy."},"local_dir": {"type": "string","description": "Absolute path of local directory to deploy."}},"required": ["type", "local_dir"]}}}

Other Tools

Tools covering the remaining operations, both essential and auxiliary.

make_manus_page

A tool that makes a Manus Page from a local MDX file.

  {"type": "function","function": {"name": "make_manus_page","description": "Make a Manus Page from a local MDX file.","parameters": {"type": "object","properties": {"mdx_file_path": {"type": "string","description": "Absolute path of the source MDX file"}},"required": ["mdx_file_path"]}}}

idle

A special tool that signals the agent has completed all tasks and is about to enter the idle state.

  {"type": "function","function": {"name": "idle","description": "A special tool to indicate you have completed all tasks and are about to enter idle state.","parameters": {"type": "object"}}}
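Since `idle` takes no arguments and only signals completion, it can serve as the stop condition of a dispatch loop. The sketch below shows that idea under stated assumptions: `run_until_idle`, the `next_call` callback (standing in for the model choosing its next tool), and the `handlers` mapping are all hypothetical names, and the real Manus agent loop is not documented at this level of detail.

```python
from typing import Callable


def run_until_idle(next_call: Callable[[], dict],
                   handlers: dict[str, Callable[..., object]],
                   max_steps: int = 50) -> list[str]:
    """Dispatch tool calls until `idle` is requested; return executed tool names."""
    executed = []
    for _ in range(max_steps):
        call = next_call()
        name = call["name"]
        if name == "idle":
            return executed  # all tasks done, stop the loop
        # Look up the handler by tool name and pass the arguments as keywords.
        handlers[name](**call.get("arguments", {}))
        executed.append(name)
    raise RuntimeError("step limit reached before idle was requested")
```

The `max_steps` guard is a design choice of this sketch, so that a sequence of calls that never reaches `idle` cannot loop forever.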

Appendix

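Browser-use Framework

[Figure: Browser-use framework]

OpenManus Architecture Diagram

[Figure: OpenManus architecture]

AI Agent Basic Architecture

[Figure: AI agent basic architecture]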
