当前位置：首页 > news >正文

深入理解cursor 中的mcp工作原理

news 2025/11/13 6:17:59

这是一个非常深入的问题，触及了现代 AI 助手（如 Cursor）的“代理”（Agentic）架构核心。

{"mcpServers": {"playwright": {"command": "npx","args": ["-y","@playwright/mcp"],"env": {"PLAYWRIGHT_BROWSERS_PATH": "0"}},"Sequential thinking": {"command": "npx","args": ["-y","@modelcontextprotocol/server-sequential-thinking"]},

提供的 mcpServers JSON 片段是 Cursor 后端的配置文件，它告诉 Cursor 如何启动和管理外部的、专门的“工具”服务器。

MCP 代表 Model Context Protocol（模型上下文协议）。这些不是模型本身，而是模型可以调用的专业工具。

简而言之：主模型（如 Gemini/GPT-4）是“大脑”，而 MCP 服务器是“专业的手脚”（如 Playwright 是“浏览器的手”，Sequential thinking 是“逻辑的手”）。 它们通过一个高度结构化的“工具调用”流程进行交互。

详细步骤：MCP 与模型的交互流程

以下是当您发出一个需要 playwright 的请求时，所发生的完整步骤：

阶段一：模型决策（大脑的“思考”）

用户提问： 您向 Cursor 提问，例如：“请访问 example.com 并告诉我页面标题。”
后端增强： Cursor 后端接收到您的问题。它不会将您的问题直接发送给 playwright。相反，它会将您的问题与一个巨大的系统提示（System Prompt，就像我们之前翻译的那些）合并。
工具呈递： 这个系统提示中包含了模型（例如 Gemini 或 GPT-4）所有可用工具的文本描述，其中会有一个工具类似于 run_playwright_test(url)。
模型决策： 主 AI 模型（Gemini）分析您的请求（“访问…并告诉我标题”）和它可用的工具列表。它判断出，要完成这个任务，它必须调用 run_playwright_test 工具。
生成工具调用： 模型的输出不是给您的答案，而是一个结构化的工具调用请求（例如 XML 或 JSON 格式），类似于：
```
<tool_call><recipient_name>playwright</recipient_name><parameters><url>example.com</url><task>"Get page title"</task></parameters>
</tool_call>
```

阶段二：后端执行（中枢神经系统的“调度”）

后端拦截： Cursor 后端拦截了这个工具调用请求。它不会把这个 XML 显示给您。
查找配置： 后端在自己的配置（您提供的 mcpServers.json）中查找 playwright。
启动 MCP 服务器： 后端看到了这个配置：
```
"playwright": {"command": "npx","args": ["-y", "@playwright/mcp"],...
}
```
如果这个 MCP 服务器尚未运行，Cursor 后端会在本地或云端执行这个命令 (npx -y @playwright/mcp) 来启动一个专门的 Playwright 服务器。
任务委托： Cursor 后端将第 5 步中模型生成的参数（url: "example.com"）通过 MCP 协议发送给这个刚刚启动的 playwright 服务器。

阶段三：MCP 服务器工作（“手脚”的执行）

执行任务： @playwright/mcp 这个服务器接收到指令，它才是实际运行 Playwright 脚本、启动浏览器、访问网页、抓取标题的进程。
返回结果： playwright 服务器完成任务后，将结果（例如 {"title": "Example Domain"}）打包，通过 MCP 协议发回给 Cursor 后端。

阶段四：模型总结（大脑的“响应”）

结果格式化： Cursor 后端接收到 playwright 的原始结果，并将其格式化为工具结果（Tool Result）的 XML，然后再次发送给主 AI 模型：
```
<tool_result><recipient_name>playwright</recipient_name><status>success</status><content>{"title": "Example Domain"}</content>
</tool_result>
```
模型总结： 主 AI 模型（Gemini）现在接收到了这个“工具结果”。它读取这个结果，并将其翻译成一句通顺的、人类可读的回复。
最终答案： 模型生成最终答案（例如：“我访问了该网站，页面标题是‘Example Domain’。”），Cursor 后端最后将这句话显示给您。

交互流程图

这是一个简化的文本流程图，展示了四个关键角色之间的交互：

   [您 (User)]              [Cursor 后端 (Orchestrator)]           [主 AI 模型 (Gemini/GPT-4)]                [MCP 服务器 (Playwright)]|                                |                                   |                                        || 1. "帮我测试 example.com"          |                                   |                                        ||----------------------------------->|                                   |                                        ||                                | 2. (添加工具列表) "请处理：..."    |                                        ||                                |----------------------------------->|                                        ||                                |                                   | 3. (思考...)                           ||                                |                                   |    (决定调用 playwright)               ||                                | 4. <tool_call name="playwright">   |                                        ||                                |<-----------------------------------|                                        ||                                | 5. (查找 mcpServers.json)           |                                        ||                                |    (启动 "npx @playwright/mcp")     |                                        ||                                | 6. (发送任务 "example.com")         |                                        ||                                |---------------------------------------------------------------------------->||                                |                                   |                                        | 10. (运行 Playwright, 访问网页)|                                |                                   |                                        | 11. (返回结果: "Title is Example")|                                | 12. (收到结果)                       |<---------------------------------------||                                |<-----------------------------------|                                        ||                                | 13. <tool_result content="...">   |                                        ||                                |----------------------------------->|                                        ||                                |                                   | 14. (思考...)                          ||                                |                                   |     (将结果转换为人类语言)             ||                                | 15. "我访问了，标题是 'Example'。" |                                        ||                                |<-----------------------------------|                                        || 16. (看到最终答案)                  |                                   |                                        ||<-----------------------------------|                                   |                                        |

首先，我必须澄清一个非常重要的概念，这也是理解整个流程的关键：

模型 (Gemini/GPT) 不知道什么是 MCP。

模型：只知道如何根据系统提示词（System Prompt）中的工具定义，生成“工具调用” (Tool Call) JSON/XML。
MCP (模型上下文协议)：这不是模型使用的协议，而是**AI 助手后端（Orchestrator，编排器）用来与外部工具服务器（Tool Server）**通信的协议。
您提供的 JSON：这是编排器的配置文件，告诉它：“当你需要运行 playwright 这个工具时，你应该通过执行 npx -y @playwright/mcp 这个命令来启动它的 MCP 服务器，然后通过标准输入/输出 (STDIO) 与之通信。”

因此，在 MCP 交互中：

MCP 客户端 = AI 助手后端 / 编排器 (例如 Cursor 的后端)。
MCP 服务端 = 您启动的专用工具进程 (例如 npx @playwright/mcp 或我们即将编写的脚本)。

模型是“触发者”，编排器是“客户端”，工具进程是“服务端”。

MCP 的实现原理 (基于 STDIN/STDOUT)

根据您提供的 command 和 args 配置，这种 MCP 实现不使用 HTTP。它使用标准输入/输出 (STDIO) 进行进程间通信 (IPC)。

客户端 (编排器)：
1. 启动 command + args 指定的子进程。
2. 通过 stdin（标准输入）向子进程写入 JSON 格式的请求。
3. 通过 stdout（标准输出）读取子进程返回的 JSON 格式的响应。
服务端 (工具)：
1. 启动后，进入一个无限循环。
2. 监听 stdin。
3. 读取一行输入，解析 JSON 请求。
4. 执行请求描述的任务（例如写文件、运行 playwright）。
5. 将执行结果格式化为 JSON，写入 stdout 并刷新 (flush) 输出缓冲区。

Python 实现案例

我们将创建一个简单的“文件写入”工具 (file_writer)。

1. MCP 服务端 (mcp_server.py)

这是一个独立的脚本，它会一直运行，等待来自 stdin 的指令。

#!/usr/bin/env python
import sys
import json
import os# 简单的日志记录（写入 stderr，这样就不会干扰 stdout 的 JSON 响应）
def log(message):print(f"[Server Log] {message}", file=sys.stderr, flush=True)def handle_request(request):"""处理解析后的 JSON 请求"""if request.get("method") == "write_file":try:params = request.get("params", {})path = params.get("path")content = params.get("content")if not path or content is None:raise ValueError("缺少 'path' 或 'content' 参数")# 确保目录存在os.makedirs(os.path.dirname(path), exist_ok=True)# 执行工具的核心任务with open(path, 'w', encoding='utf-8') as f:f.write(content)log(f"文件已写入: {path}")# 返回成功的 JSON-RPC 响应return {"jsonrpc": "2.0","id": request.get("id"),"result": {"status": "success", "path": path}}except Exception as e:log(f"错误: {e}")return {"jsonrpc": "2.0","id": request.get("id"),"error": {"code": -1, "message": str(e)}}else:return {"jsonrpc": "2.0","id": request.get("id"),"error": {"code": -2, "message": "不支持的方法"}}def main():"""MCP 服务端主循环"""log("MCP 服务端已启动，等待 stdin...")while True:try:# 1. 从 stdin 读取一行指令line = sys.stdin.readline()# 如果 stdin 关闭，退出循环if not line:log("Stdin 关闭，正在退出。")breaklog(f"收到原始请求: {line.strip()}")# 2. 解析 JSON 请求request = json.loads(line)# 3. 处理请求response = handle_request(request)# 4. 将 JSON 响应写入 stdoutresponse_json = json.dumps(response)print(response_json, flush=True) # flush=True 是必须的！except json.JSONDecodeError:log("JSON 解析错误")print(json.dumps({"jsonrpc": "2.0", "error": {"code": -3, "message": "无效的 JSON"}}), flush=True)except Exception as e:log(f"发生意外错误: {e}")breakif __name__ == "__main__":main()

2. MCP 客户端 (orchestrator.py)

这是模拟 AI 助手后端的脚本。它会启动 mcp_server.py 并向其发送任务。

import subprocess
import json
import time# 1. 模拟 AI 模型生成的 "工具调用"
model_tool_call = {"tool_name": "file_writer","parameters": {"path": "./test_output/demo.txt","content": "Hello from the MCP server!"}
}# 2. (编排器) 将模型调用转换为 MCP (JSON-RPC) 请求
# 我们自定义的协议：方法名是 "write_file"，参数是 "path" 和 "content"
mcp_request = {"jsonrpc": "2.0","id": 1,"method": "write_file","params": model_tool_call["parameters"]
}print(f"[Client] 准备启动 MCP 服务端...")
print(f"[Client] 模拟的模型工具调用: {model_tool_call}")# 3. 启动 MCP 服务端子进程（就像 mcpServers.json 中的配置一样）
# bufsize=1 表示行缓冲
# text=True (或 universal_newlines=True) 确保 stdin/stdout 是文本模式
proc = subprocess.Popen(['python', 'mcp_server.py'],stdin=subprocess.PIPE,stdout=subprocess.PIPE,stderr=subprocess.PIPE, # 捕获日志text=True,bufsize=1 
)print(f"[Client] 服务端已启动 (PID: {proc.pid})")try:# 4. (客户端) 将 JSON 请求写入服务端的 stdinrequest_json = json.dumps(mcp_request)print(f"[Client] -> 发送请求到 STDIN: {request_json}")proc.stdin.write(request_json + '\n') # 必须有换行符proc.stdin.flush() # 必须刷新# 5. (客户端) 从服务端的 stdout 读取响应print("[Client] <- 等待来自 STDOUT 的响应...")response_json = proc.stdout.readline()if not response_json:print("[Client] 错误: 未从服务端收到响应。")else:print(f"[Client] <- 收到原始响应: {response_json.strip()}")response = json.loads(response_json)print(f"[Client] 解析后的响应: {response}")# 6. (编排器) 将 MCP 响应转换为 "工具结果" 以便发回给模型if "result" in response:model_tool_result = {"tool_name": "file_writer","status": "success","content": response["result"]}else:model_tool_result = {"tool_name": "file_writer","status": "error","content": response.get("error")}print(f"[Client] 准备发回给模型的工具结果: {model_tool_result}")finally:# 7. 清理print("[Client] 正在终止服务端进程...")proc.stdin.close()proc.stdout.close()proc.terminate()proc.wait(timeout=2)print(f"[Client] 服务端日志:\n{proc.stderr.read()}")print("[Client] 流程结束。")

如何运行：

保存 mcp_server.py 和 orchestrator.py。
在终端中只运行 python orchestrator.py。

PHP 实现案例

我们将使用 PHP 实现完全相同的逻辑。

1. MCP 服务端 (mcp_server.php)

#!/usr/bin/env php
<?php// 日志函数 (写入 stderr)
function log_msg($message) {fwrite(STDERR, "[Server Log] " . $message . PHP_EOL);fflush(STDERR);
}// 请求处理函数
function handle_request($request) {if (isset($request['method']) && $request['method'] == 'write_file') {try {$params = $request['params'] ?? [];$path = $params['path'] ?? null;$content = $params['content'] ?? null;if (!$path || $content === null) {throw new Exception("缺少 'path' 或 'content' 参数");}// 确保目录存在$dir = dirname($path);if (!is_dir($dir)) {mkdir($dir, 0777, true);}// 执行核心任务file_put_contents($path, $content);log_msg("文件已写入: $path");// 返回成功响应return ["jsonrpc" => "2.0","id" => $request['id'] ?? null,"result" => ["status" => "success", "path" => $path]];} catch (Exception $e) {log_msg("错误: " . $e->getMessage());return ["jsonrpc" => "2.0","id" => $request['id'] ?? null,"error" => ["code" => -1, "message" => $e->getMessage()]];}} else {return ["jsonrpc" => "2.0","id" => $request['id'] ?? null,"error" => ["code" => -2, "message" => "不支持的方法"]];}
}// --- 服务端主循环 ---
log_msg("MCP 服务端已启动，等待 stdin...");// 打开标准输入流
$stdin = fopen('php://stdin', 'r');while (true) {// 1. 从 stdin 读取一行$line = fgets($stdin);if ($line === false) {log_msg("Stdin 关闭，正在退出。");break; // STDIN 关闭}log_msg("收到原始请求: " . trim($line));// 2. 解析 JSON$request = json_decode(trim($line), true);if (json_last_error() !== JSON_ERROR_NONE) {log_msg("JSON 解析错误");$response = ["jsonrpc" => "2.0","error" => ["code" => -3, "message" => "无效的 JSON"]];} else {// 3. 处理请求$response = handle_request($request);}// 4. 将 JSON 响应写入 stdout$response_json = json_encode($response);echo $response_json . PHP_EOL; // 必须有换行符fflush(STDOUT); // 必须刷新！
}fclose($stdin);

2. MCP 客户端 (orchestrator.php)

PHP 的进程控制 (proc_open) 比 Python 稍微复杂一些。

<?php// 1. 模拟 AI 模型生成的 "工具调用"
$model_tool_call = ["tool_name" => "file_writer","parameters" => ["path" => "./test_output/demo_php.txt","content" => "Hello from the PHP MCP server!"]
];// 2. (编排器) 转换为 MCP (JSON-RPC) 请求
$mcp_request = ["jsonrpc" => "2.0","id" => 2,"method" => "write_file","params" => $model_tool_call["parameters"]
];echo "[Client] 准备启动 MCP 服务端..." . PHP_EOL;
echo "[Client] 模拟的模型工具调用: " . json_encode($model_tool_call) . PHP_EOL;// 3. 启动 MCP 服务端子进程
$descriptorspec = [0 => ["pipe", "r"],  // stdin 是一个客户端可以写入的管道1 => ["pipe", "w"],  // stdout 是一个客户端可以读取的管道2 => ["pipe", "w"]   // stderr
];$command = 'php mcp_server.php';
$process = proc_open($command, $descriptorspec, $pipes);if (!is_resource($process)) {die("[Client] 错误: 无法启动服务端进程。" . PHP_EOL);
}echo "[Client] 服务端已启动..." . PHP_EOL;// 4. (客户端) 将 JSON 请求写入服务端的 stdin
$request_json = json_encode($mcp_request);
echo "[Client] -> 发送请求到 STDIN: $request_json" . PHP_EOL;
fwrite($pipes[0], $request_json . PHP_EOL); // 必须有换行符
fflush($pipes[0]);// 5. (客户端) 从服务端的 stdout 读取响应
echo "[Client] <- 等待来自 STDOUT 的响应..." . PHP_EOL;
$response_json = fgets($pipes[1]);if ($response_json === false) {echo "[Client] 错误: 未从服务端收到响应。" . PHP_EOL;
} else {echo "[Client] <- 收到原始响应: " . trim($response_json) . PHP_EOL;$response = json_decode(trim($response_json), true);echo "[Client] 解析后的响应: " . json_encode($response) . PHP_EOL;// 6. (编排器) 转换为 "工具结果"if (isset($response["result"])) {$model_tool_result = ["tool_name" => "file_writer","status" => "success","content" => $response["result"]];} else {$model_tool_result = ["tool_name" => "file_writer","status" => "error","content" => $response["error"] ?? null];}echo "[Client] 准备发回给模型的工具结果: " . json_encode($model_tool_result) . PHP_EOL;
}// 7. 清理
fclose($pipes[0]);
fclose($pipes[1]);
$stderr_output = stream_get_contents($pipes[2]);
fclose($pipes[2]);proc_close($process);echo "[Client] 服务端日志:\n$stderr_output" . PHP_EOL;
echo "[Client] 流程结束。" . PHP_EOL;

如何运行：

保存 mcp_server.php 和 orchestrator.php。
确保 mcp_server.php 是可执行的 (或直接用 php mcp_server.php 调用)。
在终端中只运行 php orchestrator.php。

它们如何与模型交互（总结流程）

这个流程图展示了所有组件如何协同工作：

用户

“请帮我把 ‘Hello’ 写入 ‘demo.txt’。”

模型 (Gemini/GPT)

(查看系统提示词中的 file_writer 工具定义)
(生成工具调用) ->

{"tool_name": "file_writer","parameters": { "path": "demo.txt", "content": "Hello" }
}

编排器 (MCP 客户端)
- (收到模型的工具调用)
- (查找 mcpServers 配置，找到 file_writer 工具)
- (执行 python mcp_server.py 启动进程)
- (将模型调用翻译为 MCP/JSON-RPC 请求) ->
```
{ "id": 1, "method": "write_file", "params": {...} }
```
- (通过 stdin 将此 JSON 发送给子进程)
工具 (MCP 服务端)
- (mcp_server.py 从 stdin 读到 JSON)
- (执行 file_put_contents(...))
- (通过 stdout 返回 JSON 响应) ->
```
{ "id": 1, "result": { "status": "success" } }
```
编排器 (MCP 客户端)
- (从子进程的 stdout 读到 JSON 响应)
- (将此响应翻译为模型可读的 “工具结果”) ->
```
{"tool_name": "file_writer","status": "success","content": { "status": "success" }
}
```
- (将此 “工具结果” 发送给模型)
模型 (Gemini/GPT)
- (收到 “工具结果”)
- (生成最终回复) ->
“好的，文件 ‘demo.txt’ 已成功写入。”
用户
- (看到最终回复)