MongoDB source code analysis: how a session receives a client find command
We have already analyzed how startSession in mongo/transport/service_state_machine.cpp creates the ASIOSession and verifies that the connection count does not exceed the limit.
Now let's continue and look at how the ASIOSession and the connection receive commands from the client.
The core pieces of mongo/transport/service_state_machine.cpp are:
enum class State {
    Created,     // The session has been created, but no operations have been performed yet
    Source,      // Request a new Message from the network to handle
    SourceWait,  // Wait for the new Message to arrive from the network
    Process,     // Run the Message through the database
    SinkWait,    // Wait for the database result to be sent by the network
    EndSession,  // End the session - the ServiceStateMachine will be invalid after this
    Ended        // The session has ended. It is illegal to call any method besides
                 // state() if this is the current state.
};

void _processMessage()
void _sinkCallback()
void _sourceCallback()
void _sourceMessage()
void _sinkMessage()
void _runNextInGuard()
For the first command on a connection, the state transition flow is: State::Created -> State::Source -> State::SourceWait -> State::Process -> State::SinkWait -> State::Source -> State::SourceWait
State explanations (see the sketch after this list):
State::Created    // the session has just been created and has not received any command yet
State::Source     // ask the network for a new command from the client
State::SourceWait // wait for the client's new command to arrive
State::Process    // hand the received command to the MongoDB database for execution
State::SinkWait   // wait for the command's result to be sent back to the client
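To make this loop concrete, here is a minimal standalone sketch (not MongoDB code; everything except the state names is invented for illustration) that drives one connection through the same sequence for two commands and then ends the session:

#include <iostream>

// Same state set as ServiceStateMachine, but driven by a toy loop.
enum class State { Created, Source, SourceWait, Process, SinkWait, EndSession, Ended };

const char* name(State s) {
    switch (s) {
        case State::Created:    return "Created";
        case State::Source:     return "Source";
        case State::SourceWait: return "SourceWait";
        case State::Process:    return "Process";
        case State::SinkWait:   return "SinkWait";
        case State::EndSession: return "EndSession";
        case State::Ended:      return "Ended";
    }
    return "?";
}

int main() {
    State state = State::Created;
    int commandsHandled = 0;
    while (state != State::Ended) {
        std::cout << name(state) << "\n";
        switch (state) {
            case State::Created:    state = State::Source;     break;  // first run: start sourcing
            case State::Source:     state = State::SourceWait; break;  // ask the network for a message
            case State::SourceWait: state = State::Process;    break;  // a command arrived
            case State::Process:    state = State::SinkWait;   break;  // database produced a response
            case State::SinkWait:                                      // response sent to the client
                state = (++commandsHandled < 2) ? State::Source : State::EndSession;
                break;
            case State::EndSession: state = State::Ended;      break;
            case State::Ended:      break;
        }
    }
    std::cout << name(state) << "\n";
    return 0;
}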
These core methods of mongo/transport/service_state_machine.cpp call each other in a loop: _runNextInGuard -> _sourceMessage -> _sourceCallback -> _scheduleNextWithGuard (schedule) -> _runNextInGuard -> _processMessage -> _sinkMessage -> _sinkCallback -> _scheduleNextWithGuard (schedule) -> _runNextInGuard -> ...
The core _runNextInGuard method mainly dispatches on State::Source and State::Process: in State::Source the session waits for the client's find request, and in State::Process the command is handed to the MongoDB database. The code of _runNextInGuard is:
void ServiceStateMachine::_runNextInGuard(ThreadGuard guard) {
    auto curState = state();
    dassert(curState != State::Ended);

    // If this is the first run of the SSM, then update its state to Source
    if (curState == State::Created) {
        curState = State::Source;
        _state.store(curState);
    }

    // Make sure the current Client got set correctly
    dassert(Client::getCurrent() == _dbClientPtr);

    try {
        switch (curState) {
            case State::Source:
                LOG(1) << "conca _runNextInGuard State::Source";
                _sourceMessage(std::move(guard));
                break;
            case State::Process:
                LOG(1) << "conca _runNextInGuard State::Process";
                _processMessage(std::move(guard));
                break;
            case State::EndSession:
                LOG(1) << "conca _runNextInGuard State::EndSession";
                _cleanupSession(std::move(guard));
                break;
            default:
                MONGO_UNREACHABLE;
        }

        return;
    } catch (const DBException& e) {
        log() << "DBException handling request, closing client connection: " << redact(e);
    }

    if (!guard) {
        guard = ThreadGuard(this);
    }
    _state.store(State::EndSession);
    _cleanupSession(std::move(guard));
}
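A recurring detail here is the ThreadGuard passed by value into every step: as the code comments say, it takes ownership of the state machine in the current thread, and guard.release() gives that ownership up before an asynchronous wait so the completion callback can re-acquire it on whatever thread runs it. The RAII idea can be sketched roughly like this (a simplified illustration, not the real ThreadGuard implementation):

#include <cassert>
#include <iostream>

struct StateMachine {
    bool ownedByAThread = false;
};

// Simplified RAII guard: constructing it claims the state machine for this thread,
// destroying it (without release) gives the claim back automatically.
class ThreadGuard {
public:
    explicit ThreadGuard(StateMachine* ssm) : _ssm(ssm) {
        assert(!_ssm->ownedByAThread);
        _ssm->ownedByAThread = true;
    }
    ~ThreadGuard() {
        if (_ssm) _ssm->ownedByAThread = false;
    }
    // Called before handing off to an async operation: the claim is dropped on purpose,
    // so the completion callback can construct its own guard later.
    void release() {
        _ssm->ownedByAThread = false;
        _ssm = nullptr;
    }
    explicit operator bool() const { return _ssm != nullptr; }

private:
    StateMachine* _ssm;
};

int main() {
    StateMachine ssm;
    {
        ThreadGuard guard(&ssm);                                    // e.g. _sourceMessage starts with a guard
        std::cout << std::boolalpha << ssm.ownedByAThread << "\n";  // true
        guard.release();                                            // released before the async network wait
        std::cout << ssm.ownedByAThread << "\n";                    // false
    }
    ThreadGuard guard(&ssm);                  // e.g. _sourceCallback re-acquires on its thread
    std::cout << ssm.ownedByAThread << "\n";  // true
    return 0;
}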
The session then waits for a request on the connection; the state transition is: State::Created -> State::Source -> State::SourceWait.
void ServiceStateMachine::_sourceMessage(ThreadGuard guard) {
    invariant(_inMessage.empty());
    invariant(_state.load() == State::Source);
    LOG(1) << "conca _sourceMessage State::Source";
    _state.store(State::SourceWait);
    LOG(1) << "conca _sourceMessage store State::SourceWait";
    guard.release();

    auto sourceMsgImpl = [&] {
        if (_transportMode == transport::Mode::kSynchronous) {
            MONGO_IDLE_THREAD_BLOCK;
            return Future<Message>::makeReady(_session()->sourceMessage());
        } else {
            invariant(_transportMode == transport::Mode::kAsynchronous);
            return _session()->asyncSourceMessage();
        }
    };

    sourceMsgImpl().getAsync([this](StatusWith<Message> msg) {
        if (msg.isOK()) {
            _inMessage = std::move(msg.getValue());
            invariant(!_inMessage.empty());
        }
        _sourceCallback(msg.getStatus());
    });
}
_sourceMessage handles the transition State::Source -> State::SourceWait and waits for the session to receive a message. In asynchronous mode, _session()->asyncSourceMessage() makes the session wait asynchronously for the find command message sent by the client.
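The getAsync continuation above ("start an asynchronous read, then run a callback with the result") can be illustrated with plain standard-library tools; the following standalone sketch uses my own simplified stand-ins rather than MongoDB's Future, Message and StatusWith types:

#include <chrono>
#include <functional>
#include <future>
#include <iostream>
#include <string>
#include <thread>

// Simplified stand-in for StatusWith<Message>.
struct StatusWithMessage {
    bool ok;
    std::string message;  // the raw command bytes in the real code
};

// Start an asynchronous "read" and invoke the callback when the message arrives,
// mirroring sourceMsgImpl().getAsync(...) in _sourceMessage.
void asyncSourceMessage(std::function<void(StatusWithMessage)> callback) {
    std::thread([cb = std::move(callback)] {
        std::this_thread::sleep_for(std::chrono::milliseconds(100));  // pretend to wait on the socket
        cb({true, R"({"find":"user","filter":{}})"});                 // a fake find command arrives
    }).detach();
}

int main() {
    std::promise<void> done;
    asyncSourceMessage([&](StatusWithMessage msg) {
        if (msg.ok) {
            std::cout << "sourced message: " << msg.message << "\n";  // would become _inMessage
        }
        done.set_value();  // the real code calls _sourceCallback(msg.getStatus()) here
    });
    done.get_future().wait();
    return 0;
}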
When a find command arrives, _sourceCallback(msg.getStatus()) is invoked. The code of _sourceCallback is:
void ServiceStateMachine::_sourceCallback(Status status) {
    // The first thing to do is create a ThreadGuard which will take ownership of the SSM in this
    // thread.
    ThreadGuard guard(this);

    // Make sure we just called sourceMessage();
    LOG(1) << "conca _sinkMessage State::SinkWait";
    dassert(state() == State::SourceWait);
    auto remote = _session()->remote();

    if (status.isOK()) {
        _state.store(State::Process);
        LOG(1) << "conca _sinkMessage store State::Process";

        // Since we know that we're going to process a message, call scheduleNext() immediately
        // to schedule the call to processMessage() on the serviceExecutor (or just unwind the
        // stack)

        // If this callback doesn't own the ThreadGuard, then we're being called recursively,
        // and the executor shouldn't start a new thread to process the message - it can use this
        // one just after this returns.
        return _scheduleNextWithGuard(std::move(guard),
                                      ServiceExecutor::kMayRecurse,
                                      transport::ServiceExecutorTaskName::kSSMProcessMessage);
    } else if (ErrorCodes::isInterruption(status.code()) ||
               ErrorCodes::isNetworkError(status.code())) {
        LOG(1) << "Session from " << remote
               << " encountered a network error during SourceMessage: " << status;
        _state.store(State::EndSession);
    } else if (status == TransportLayer::TicketSessionClosedStatus) {
        // Our session may have been closed internally.
        LOG(1) << "Session from " << remote << " was closed internally during SourceMessage";
        _state.store(State::EndSession);
    } else {
        log() << "Error receiving request from client: " << status << ". Ending connection from "
              << remote << " (connection id: " << _session()->id() << ")";
        _state.store(State::EndSession);
    }

    // There was an error receiving a message from the client and we've already printed the error
    // so call runNextInGuard() to clean up the session without waiting.
    _runNextInGuard(std::move(guard));
}
Once the find command is received it is handed to the MongoDB database for execution; the state transition is State::SourceWait -> State::Process, and the call chain continues _scheduleNextWithGuard -> schedule -> _runNextInGuard (the same method shown above, invoked over and over).
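The "hand the next step to the executor instead of calling it inline" idea can be approximated with a plain task queue. The sketch below uses invented names (ToyExecutor), not MongoDB's ServiceExecutor API:

#include <functional>
#include <iostream>
#include <queue>

// A toy single-threaded executor: tasks are queued and drained in order,
// which is roughly what scheduling the next SSM step achieves.
class ToyExecutor {
public:
    void schedule(std::function<void()> task) { _tasks.push(std::move(task)); }
    void drain() {
        while (!_tasks.empty()) {
            auto task = std::move(_tasks.front());
            _tasks.pop();
            task();
        }
    }

private:
    std::queue<std::function<void()>> _tasks;
};

int main() {
    ToyExecutor executor;
    // _sourceCallback: the message arrived, so schedule the Process step instead of
    // running it inline (letting the current stack unwind first, as the real comment explains).
    executor.schedule([&] {
        std::cout << "Process: hand the message to the database\n";
        // _processMessage would in turn schedule the next Source step after sinking the reply.
        executor.schedule([] { std::cout << "Source: wait for the next command\n"; });
    });
    executor.drain();
    return 0;
}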
At this point _runNextInGuard sees State::Process, so it calls _processMessage, which runs the database command and receives the data returned by the database. The code of _processMessage is:
void ServiceStateMachine::_processMessage(ThreadGuard guard) {
    invariant(!_inMessage.empty());
    LOG(1) << "conca _processMessage";

    TrafficRecorder::get(_serviceContext)
        .observe(_sessionHandle, _serviceContext->getPreciseClockSource()->now(), _inMessage);

    auto& compressorMgr = MessageCompressorManager::forSession(_session());

    _compressorId = boost::none;
    if (_inMessage.operation() == dbCompressed) {
        MessageCompressorId compressorId;
        auto swm = compressorMgr.decompressMessage(_inMessage, &compressorId);
        uassertStatusOK(swm.getStatus());
        _inMessage = swm.getValue();
        _compressorId = compressorId;
    }

    networkCounter.hitLogicalIn(_inMessage.size());

    // Pass sourced Message to handler to generate response.
    auto opCtx = Client::getCurrent()->makeOperationContext();

    // The handleRequest is implemented in a subclass for mongod/mongos and actually all the
    // database work for this request.
    DbResponse dbresponse = _sep->handleRequest(opCtx.get(), _inMessage);

    // opCtx must be destroyed here so that the operation cannot show
    // up in currentOp results after the response reaches the client
    opCtx.reset();

    // Format our response, if we have one
    Message& toSink = dbresponse.response;
    if (!toSink.empty()) {
        invariant(!OpMsg::isFlagSet(_inMessage, OpMsg::kMoreToCome));
        invariant(!OpMsg::isFlagSet(toSink, OpMsg::kChecksumPresent));

        // Update the header for the response message.
        toSink.header().setId(nextMessageId());
        toSink.header().setResponseToMsgId(_inMessage.header().getId());

        if (OpMsg::isFlagSet(_inMessage, OpMsg::kChecksumPresent)) {
#ifdef MONGO_CONFIG_SSL
            if (!SSLPeerInfo::forSession(_session()).isTLS) {
                OpMsg::appendChecksum(&toSink);
            }
#else
            OpMsg::appendChecksum(&toSink);
#endif
        }

        // If the incoming message has the exhaust flag set and is a 'getMore' command, then we
        // bypass the normal RPC behavior. We will sink the response to the network, but we also
        // synthesize a new 'getMore' request, as if we sourced a new message from the network. This
        // new request is sent to the database once again to be processed. This cycle repeats as
        // long as the associated cursor is not exhausted. Once it is exhausted, we will send a
        // final response, terminating the exhaust stream.
        _inMessage = makeExhaustMessage(_inMessage, &dbresponse);
        _inExhaust = !_inMessage.empty();

        networkCounter.hitLogicalOut(toSink.size());

        if (_compressorId) {
            auto swm = compressorMgr.compressMessage(toSink, &_compressorId.value());
            uassertStatusOK(swm.getStatus());
            toSink = swm.getValue();
        }

        TrafficRecorder::get(_serviceContext)
            .observe(_sessionHandle, _serviceContext->getPreciseClockSource()->now(), toSink);

        _sinkMessage(std::move(guard), std::move(toSink));
        LOG(1) << "conca _processMessage _sinkMessage";
    } else {
        _state.store(State::Source);
        _inMessage.reset();
        LOG(1) << "conca _processMessage store(State::Source)";
        return _scheduleNextWithGuard(std::move(guard),
                                      ServiceExecutor::kDeferredTask,
                                      transport::ServiceExecutorTaskName::kSSMSourceMessage);
    }
}
DbResponse dbresponse = _sep->handleRequest(opCtx.get(), _inMessage) hands the received find command to the MongoDB database; the database runs the query and returns the response.
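This is the boundary between the transport layer and the database layer: the state machine only depends on an entry point that turns a request Message into a DbResponse, and mongod and mongos each supply their own implementation of handleRequest. A rough conceptual sketch with simplified types (not the real ServiceEntryPoint interface):

#include <iostream>
#include <memory>
#include <string>

// Simplified stand-ins for Message / DbResponse.
struct Message {
    std::string payload;
    bool empty() const { return payload.empty(); }
};
struct DbResponse {
    Message response;
};

// What the state machine depends on: "give me a response for this request".
class EntryPoint {
public:
    virtual ~EntryPoint() = default;
    virtual DbResponse handleRequest(const Message& request) = 0;
};

// A toy mongod-like implementation: pretend the find runs a collection scan.
class ToyMongodEntryPoint : public EntryPoint {
public:
    DbResponse handleRequest(const Message& request) override {
        std::cout << "executing: " << request.payload << "\n";
        return DbResponse{Message{R"({"cursor":{"firstBatch":[...]},"ok":1})"}};
    }
};

int main() {
    std::unique_ptr<EntryPoint> sep = std::make_unique<ToyMongodEntryPoint>();
    Message inMessage{R"({"find":"user","filter":{}})"};
    DbResponse dbresponse = sep->handleRequest(inMessage);  // mirrors _sep->handleRequest(...)
    if (!dbresponse.response.empty()) {
        std::cout << "sinking response: " << dbresponse.response.payload << "\n";
    }
    return 0;
}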
Finally _sinkMessage is called to send the database result back to the client; the state transition is State::Process -> State::SinkWait.
void ServiceStateMachine::_sinkMessage(ThreadGuard guard, Message toSink) {
    // Sink our response to the client
    invariant(_state.load() == State::Process);
    LOG(1) << "conca _sinkMessage State::Source";
    _state.store(State::SinkWait);
    LOG(1) << "conca _sinkMessage store State::SinkWait";
    guard.release();

    auto sinkMsgImpl = [&] {
        if (_transportMode == transport::Mode::kSynchronous) {
            // We don't consider ourselves idle while sending the reply since we are still doing
            // work on behalf of the client. Contrast that with sourceMessage() where we are waiting
            // for the client to send us more work to do.
            return Future<void>::makeReady(_session()->sinkMessage(std::move(toSink)));
        } else {
            invariant(_transportMode == transport::Mode::kAsynchronous);
            return _session()->asyncSinkMessage(std::move(toSink));
        }
    };

    sinkMsgImpl().getAsync([this](Status status) { _sinkCallback(std::move(status)); });
}
_session()->asyncSinkMessage(std::move(toSink)) sends the result to the client; once the send completes, _sinkCallback is invoked.
void ServiceStateMachine::_sinkCallback(Status status) {
    // The first thing to do is create a ThreadGuard which will take ownership of the SSM in this
    // thread.
    ThreadGuard guard(this);

    LOG(1) << "conca _sinkCallback State::SinkWait";
    dassert(state() == State::SinkWait);

    // If there was an error sinking the message to the client, then we should print an error and
    // end the session. No need to unwind the stack, so this will runNextInGuard() and return.
    //
    // Otherwise, update the current state depending on whether we're in exhaust or not, and call
    // scheduleNext() to unwind the stack and do the next step.
    if (!status.isOK()) {
        log() << "Error sending response to client: " << status << ". Ending connection from "
              << _session()->remote() << " (connection id: " << _session()->id() << ")";
        _state.store(State::EndSession);
        return _runNextInGuard(std::move(guard));
    } else if (_inExhaust) {
        _state.store(State::Process);
        LOG(1) << "conca _sinkCallback store State::Process";
        return _scheduleNextWithGuard(std::move(guard),
                                      ServiceExecutor::kDeferredTask |
                                          ServiceExecutor::kMayYieldBeforeSchedule,
                                      transport::ServiceExecutorTaskName::kSSMExhaustMessage);
    } else {
        _state.store(State::Source);
        LOG(1) << "conca _sinkCallback store State::Source";
        return _scheduleNextWithGuard(std::move(guard),
                                      ServiceExecutor::kDeferredTask |
                                          ServiceExecutor::kMayYieldBeforeSchedule,
                                      transport::ServiceExecutorTaskName::kSSMSourceMessage);
    }
}
After the session has sent the response, the state goes State::SinkWait -> State::Source. At this point the find command is complete, and _scheduleNextWithGuard puts the state machine back to waiting for the next command.
The D1 debug log printed while handling this find command (a plain db.user.find() issued from the MongoDB shell, as the COMMAND entry below shows) looks like this:
2025-04-24T21:24:02.580+0800 D1 NETWORK [conn1] conca _sourceMessage State::Source
2025-04-24T21:24:02.581+0800 D1 NETWORK [conn1] conca _sourceMessage store State::SourceWait
2025-04-24T21:24:09.957+0800 D1 NETWORK [conn1] conca _sinkMessage State::SinkWait
2025-04-24T21:24:09.957+0800 D1 NETWORK [conn1] conca _sinkMessage store State::Process
2025-04-24T21:24:09.958+0800 D1 NETWORK [conn1] conca func end
2025-04-24T21:24:09.961+0800 D1 NETWORK [conn1] conca guard.release() end
2025-04-24T21:24:09.962+0800 D1 NETWORK [conn1] conca _runNextInGuard State::Process
2025-04-24T21:24:09.963+0800 D1 NETWORK [conn1] conca _processMessage
2025-04-24T21:24:09.964+0800 D1 NETWORK [conn1] conca _processMessage _sep->handleRequest
2025-04-24T21:24:09.965+0800 I SHARDING [conn1] Marking collection db.user as collection version: <unsharded>
2025-04-24T21:24:09.969+0800 I COMMAND [conn1] command db.user appName: "MongoDB Shell" command: find { find: "user", filter: {}, lsid: { id: UUID("7ae50c73-fcc6-4d15-8a80-7f2bf2192e0f") }, $db: "db" } planSummary: COLLSCAN keysExamined:0 docsExamined:6 cursorExhausted:1 numYields:0 nreturned:6 reslen:421 locks:{ ReplicationStateTransition: { acquireCount: { w: 1 } }, Global: { acquireCount: { r: 1 } }, Database: { acquireCount: { r: 1 } }, Collection: { acquireCount: { r: 1 } }, Mutex: { acquireCount: { r: 1 } } } storage:{ data: { bytesRead: 416 } } protocol:op_msg 3ms
2025-04-24T21:24:09.975+0800 D1 NETWORK [conn1] conca _processMessage _sep->handleRequest dbresponse
2025-04-24T21:24:09.978+0800 D1 NETWORK [conn1] conca _processMessage !toSink.empty()
2025-04-24T21:24:09.978+0800 D1 NETWORK [conn1] conca _processMessage makeExhaustMessage
2025-04-24T21:24:09.979+0800 D1 NETWORK [conn1] conca _processMessage TrafficRecorder::get(_serviceContext) .observe
2025-04-24T21:24:09.980+0800 D1 NETWORK [conn1] conca _processMessage _sinkMessage
2025-04-24T21:24:09.981+0800 D1 NETWORK [conn1] conca _sinkMessage State::Source
2025-04-24T21:24:09.986+0800 D1 NETWORK [conn1] conca _sinkMessage store State::SinkWait
2025-04-24T21:24:09.986+0800 D1 NETWORK [conn1] conca _sinkCallback State::SinkWait
2025-04-24T21:24:09.987+0800 D1 NETWORK [conn1] conca _sinkCallback store State::Source
2025-04-24T21:24:09.988+0800 D1 NETWORK [conn1] conca func end
2025-04-24T21:24:09.989+0800 D1 NETWORK [conn1] conca guard.release() end
2025-04-24T21:24:09.990+0800 D1 NETWORK [conn1] conca_serviceExecutor->schedule(std::move(func), flags, taskName)
2025-04-24T21:24:09.991+0800 D1 NETWORK [conn1] conca_serviceExecutor->schedule(std::move(func), flags, taskName)
2025-04-24T21:24:09.992+0800 D1 NETWORK [conn1] conca _runNextInGuard State::Source
2025-04-24T21:24:09.993+0800 D1 NETWORK [conn1] conca _sourceMessage State::Source
2025-04-24T21:24:09.994+0800 D1 NETWORK [conn1] conca _sourceMessage store State::SourceWait
Summary:
For the first command, the flow is: State::Created -> State::Source -> State::SourceWait -> State::Process -> State::SinkWait -> State::Source -> State::SourceWait
For the second command, the flow is: State::Process -> State::SinkWait -> State::Source -> State::SourceWait
For the third command, the flow is: State::Process -> State::SinkWait -> State::Source -> State::SourceWait
The second and later commands appear to start at State::Process because the State::Source -> State::SourceWait transition for the next command already happens at the tail end of handling the previous one.