
Netty decoders: DelimiterBasedFrameDecoder

news 2025/9/24 8:18:19

`DelimiterBasedFrameDecoder`

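DelimiterBasedFrameDecoder is a general-purpose handler for the TCP "sticky packet / half packet" problem (several messages coalesced into one read, or one message split across several reads). It splits the incoming ByteBuf stream on one or more user-specified delimiters.

Compared with LineBasedFrameDecoder, DelimiterBasedFrameDecoder is more general. LineBasedFrameDecoder only handles \n and \r\n, whereas DelimiterBasedFrameDecoder accepts any byte sequence as a delimiter, for example \0 (the NUL character), $$, or any other custom byte combination.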

/*
 * Copyright 2012 The Netty Project
 *
 * The Netty Project licenses this file to you under the Apache License,
 * version 2.0 (the "License"); you may not use this file except in compliance
 * with the License. You may obtain a copy of the License at:
 *
 *   https://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
 * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
 * License for the specific language governing permissions and limitations
 * under the License.
 */
package io.netty.handler.codec;

// ... existing code ...

/**
 * A decoder that splits the received {@link ByteBuf}s by one or more
 * delimiters.  It is particularly useful for decoding the frames which ends
 * with a delimiter such as {@link Delimiters#nulDelimiter() NUL} or
 * {@linkplain Delimiters#lineDelimiter() newline characters}.
 *
 * <h3>Predefined delimiters</h3>
 * <p>
 * {@link Delimiters} defines frequently used delimiters for convenience' sake.
 *
 * <h3>Specifying more than one delimiter</h3>
 * <p>
 * {@link DelimiterBasedFrameDecoder} allows you to specify more than one
 * delimiter.  If more than one delimiter is found in the buffer, it chooses
 * the delimiter which produces the shortest frame.  For example, if you have
 * the following data in the buffer:
 * <pre>
 * +--------------+
 * | ABC\nDEF\r\n |
 * +--------------+
 * </pre>
 * a {@link DelimiterBasedFrameDecoder}({@link Delimiters#lineDelimiter() Delimiters.lineDelimiter()})
 * will choose {@code '\n'} as the first delimiter and produce two frames:
 * <pre>
 * +-----+-----+
 * | ABC | DEF |
 * +-----+-----+
 * </pre>
 * rather than incorrectly choosing {@code '\r\n'} as the first delimiter:
 * <pre>
 * +----------+
 * | ABC\nDEF |
 * +----------+
 * </pre>
 */
public class DelimiterBasedFrameDecoder extends ByteToMessageDecoder {
// ... existing code ...

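Two key characteristics can be drawn from the class comment:

Generality: any custom delimiter can be used. Netty predefines some common delimiters in the Delimiters class, such as nulDelimiter() and lineDelimiter().
Multi-delimiter strategy: when several delimiters are supplied, the decoder picks the one that produces the shortest frame. As in the Javadoc example, for ABC\nDEF\r\n with the delimiters \n and \r\n, the decoder matches \n first and emits ABC as the first frame instead of waiting for \r\n.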
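The shortest-frame rule can be sketched without Netty at all. The following is a minimal illustration on plain byte arrays, not Netty's implementation: the class name ShortestFrameDemo and the helpers indexOf and split are hypothetical, and a byte[] stands in for ByteBuf.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; byte[] plays the role of ByteBuf.
final class ShortestFrameDemo {

    // Index of the first occurrence of needle in haystack at or after 'from', or -1.
    public static int indexOf(byte[] haystack, int from, byte[] needle) {
        outer:
        for (int i = from; i + needle.length <= haystack.length; i++) {
            for (int j = 0; j < needle.length; j++) {
                if (haystack[i + j] != needle[j]) {
                    continue outer;
                }
            }
            return i;
        }
        return -1;
    }

    // Splits data into frames, always cutting at the delimiter that yields the shortest frame.
    public static List<String> split(byte[] data, byte[][] delimiters) {
        List<String> frames = new ArrayList<>();
        int reader = 0;
        while (true) {
            int minFrameLength = Integer.MAX_VALUE;
            byte[] minDelim = null;
            for (byte[] delim : delimiters) {
                int idx = indexOf(data, reader, delim);
                if (idx >= 0 && idx - reader < minFrameLength) {
                    minFrameLength = idx - reader;
                    minDelim = delim;
                }
            }
            if (minDelim == null) {
                break; // no complete frame left; a real decoder would wait for more data
            }
            frames.add(new String(data, reader, minFrameLength, StandardCharsets.US_ASCII));
            reader += minFrameLength + minDelim.length; // skip the frame and its delimiter
        }
        return frames;
    }

    public static void main(String[] args) {
        byte[] data = "ABC\nDEF\r\n".getBytes(StandardCharsets.US_ASCII);
        byte[][] delims = {
            "\r\n".getBytes(StandardCharsets.US_ASCII),
            "\n".getBytes(StandardCharsets.US_ASCII)
        };
        System.out.println(split(data, delims)); // prints [ABC, DEF]
    }
}
```

For the first frame \n wins (frame length 3 versus 7 for \r\n); for the second frame \r\n wins (3 versus 4), so the trailing \r is not left inside DEF.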

Key fields

// ... existing code ...
public class DelimiterBasedFrameDecoder extends ByteToMessageDecoder {

    private final ByteBuf[] delimiters;
    private final int maxFrameLength;
    private final boolean stripDelimiter;
    private final boolean failFast;
    private boolean discardingTooLongFrame;
    private int tooLongFrameLength;
    /** Set only when decoding with "\n" and "\r\n" as the delimiter.  */
    private final LineBasedFrameDecoder lineBasedDecoder;
// ... existing code ...

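delimiters: a ByteBuf array holding all user-defined delimiters.
maxFrameLength, stripDelimiter, failFast: these three parameters play the same roles as in LineBasedFrameDecoder. They control the maximum frame length, whether the delimiter is stripped from the frame, and whether to fail fast.
discardingTooLongFrame, tooLongFrameLength: these two fields handle over-long frames and are functionally equivalent to discarding and discardedBytes in LineBasedFrameDecoder.
lineBasedDecoder: an important optimization field. If the decoder finds at construction time that the supplied delimiters are exactly \n and \r\n, it does not process them itself. Instead it creates a LineBasedFrameDecoder instance and delegates all subsequent decoding to it, because LineBasedFrameDecoder is optimized specifically for line decoding and performs better.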

Constructors and initialization

DelimiterBasedFrameDecoder provides several overloaded constructors, all of which eventually call the most complete one shown below.

// ... existing code ...
    public DelimiterBasedFrameDecoder(
            int maxFrameLength, boolean stripDelimiter, boolean failFast, ByteBuf... delimiters) {
        validateMaxFrameLength(maxFrameLength);
        ObjectUtil.checkNonEmpty(delimiters, "delimiters");

        if (isLineBased(delimiters) && !isSubclass()) {
            lineBasedDecoder = new LineBasedFrameDecoder(maxFrameLength, stripDelimiter, failFast);
            this.delimiters = null;
        } else {
            this.delimiters = new ByteBuf[delimiters.length];
            for (int i = 0; i < delimiters.length; i ++) {
                ByteBuf d = delimiters[i];
                validateDelimiter(d);
                this.delimiters[i] = d.slice(d.readerIndex(), d.readableBytes());
            }
            lineBasedDecoder = null;
        }
        this.maxFrameLength = maxFrameLength;
        this.stripDelimiter = stripDelimiter;
        this.failFast = failFast;
    }
// ... existing code ...

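The heart of the initialization is the check if (isLineBased(delimiters) && !isSubclass()):

isLineBased(delimiters): this private static method checks whether the delimiters array consists of exactly the two delimiters \n and \r\n, in either order.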

// ... existing code ...
    private static boolean isLineBased(final ByteBuf[] delimiters) {
        if (delimiters.length != 2) {
            return false;
        }
        ByteBuf a = delimiters[0];
        ByteBuf b = delimiters[1];
        if (a.capacity() < b.capacity()) {
            a = delimiters[1];
            b = delimiters[0];
        }
        return a.capacity() == 2 && b.capacity() == 1
                && a.getByte(0) == '\r' && a.getByte(1) == '\n'
                && b.getByte(0) == '\n';
    }
// ... existing code ...

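!isSubclass(): this check ensures that only DelimiterBasedFrameDecoder itself triggers the optimization. If a user creates a subclass of DelimiterBasedFrameDecoder and overrides decode, the optimization must not kick in, otherwise the subclass logic could be bypassed.
If both conditions hold, a new LineBasedFrameDecoder is created and assigned to the lineBasedDecoder field, and delimiters is set to null.
Otherwise, each user-supplied delimiter is validated and a slice of it is stored in the delimiters array.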
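To make the "exactly \n and \r\n, in either order" rule concrete, here is a small stand-alone sketch of the same check on plain byte arrays. It is an illustration only: the class LineDelimiterCheck is hypothetical, byte[] replaces ByteBuf, and the array length replaces capacity().

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch of the line-delimiter check on byte arrays.
final class LineDelimiterCheck {

    // Returns true only when the two delimiters are exactly "\r\n" and "\n", in either order.
    public static boolean isLineBased(byte[][] delimiters) {
        if (delimiters.length != 2) {
            return false;
        }
        byte[] a = delimiters[0];
        byte[] b = delimiters[1];
        if (a.length < b.length) {
            // Make 'a' the longer delimiter so the order of the arguments does not matter.
            a = delimiters[1];
            b = delimiters[0];
        }
        return a.length == 2 && b.length == 1
                && a[0] == '\r' && a[1] == '\n'
                && b[0] == '\n';
    }

    private static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(isLineBased(new byte[][] {ascii("\n"), ascii("\r\n")})); // true
        System.out.println(isLineBased(new byte[][] {ascii("\r\n"), ascii("\n")})); // true
        System.out.println(isLineBased(new byte[][] {ascii("$$"), ascii("\n")}));   // false
        System.out.println(isLineBased(new byte[][] {ascii("\n")}));                // false
    }
}
```

Any other combination, including a single \n delimiter, takes the generic path and is stored in the delimiters array.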

Core decoding logic: `decode`

The decode method is where the actual decoding happens.

// ... existing code ...
    protected Object decode(ChannelHandlerContext ctx, ByteBuf buffer) throws Exception {
        if (lineBasedDecoder != null) {
            return lineBasedDecoder.decode(ctx, buffer);
        }
        // Try all delimiters and choose the delimiter which yields the shortest frame.
        int minFrameLength = Integer.MAX_VALUE;
        ByteBuf minDelim = null;
        for (ByteBuf delim: delimiters) {
            int frameLength = indexOf(buffer, delim);
            if (frameLength >= 0 && frameLength < minFrameLength) {
                minFrameLength = frameLength;
                minDelim = delim;
            }
        }

        if (minDelim != null) {
            // ... delimiter found: handle the frame ...
        } else {
            // ... no delimiter found: handle a half packet or an over-long frame ...
        }
    }
// ... existing code ...

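The decoding logic breaks down into three parts:

Delegated decoding (the optimized path):
- The first statement of the method is if (lineBasedDecoder != null). If the line-decoding optimization was triggered at construction time, every call to decode is forwarded straight to lineBasedDecoder.decode(), and none of the remaining logic runs.
Generic decoding - finding the shortest frame:
- Outside the line-decoding case, the code iterates over the delimiters array.
- For each delimiter delim, it calls indexOf(buffer, delim) to search the current buffer. indexOf returns the number of bytes from readerIndex to the start of the delimiter, or a negative value if the delimiter is absent.
- The goal of the loop is to find the delimiter minDelim that is present in the buffer (frameLength >= 0) and yields the shortest frame (frameLength < minFrameLength).
Generic decoding - handling the result:
- Case 1: a delimiter was found (minDelim != null)
  - First, check whether the decoder is in the discardingTooLongFrame state. If so, the over-long frame has just ended: the decoder resets the state, skips the rest of that frame together with its delimiter, and returns null. The exception is thrown at this point only if failFast is false; with failFast set to true it was already thrown when discarding began.
  - Otherwise, check whether the frame length minFrameLength exceeds maxFrameLength. If it does, skip the over-long frame and its delimiter and throw TooLongFrameException.
  - If the frame length is valid, extract the frame with readRetainedSlice, with or without the delimiter depending on stripDelimiter, and return it.
- Case 2: no delimiter was found (minDelim == null)
  - The current buffer does not contain a complete frame.
  - Check whether the number of readable bytes already exceeds maxFrameLength.
  - If it does and the decoder is not yet discarding, enter the discarding state (discardingTooLongFrame = true), record the byte count in tooLongFrameLength, and drop everything currently in the buffer. If failFast is true, throw the exception immediately.
  - If the decoder is already discarding, keep adding to tooLongFrameLength and dropping newly arrived data.
  - If maxFrameLength is not exceeded, this is an ordinary half packet: return null and wait for more data.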

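Summary

DelimiterBasedFrameDecoder is an important member of Netty's decoder family.

Strengths:
- Highly flexible: any byte sequence can serve as a delimiter, which covers a wide range of proprietary protocols.
- Multiple delimiters: protocols with more than one end-of-message marker are supported.
- Automatic optimization: for the most common case, line delimiters (\n and \r\n), it switches to the faster LineBasedFrameDecoder.
- Robustness: built-in handling of over-long frames keeps a peer that never sends a delimiter from growing the buffer without bound.
Use cases:
- Any protocol that uses a specific character or string as the message boundary, for example a stream of \0-terminated C-style strings, or custom text protocols used in logging, finance, and similar domains.
- When both \n and \r\n must be accepted as line endings, new DelimiterBasedFrameDecoder(MAX_LEN, Delimiters.lineDelimiter()) can be used directly.
Relationship to other decoders:
- It is the generalized version of LineBasedFrameDecoder.
- Together with FixedLengthFrameDecoder (fixed-length framing) and LengthFieldBasedFrameDecoder (length-field framing), it forms Netty's main toolkit for the sticky packet / half packet problem, each suited to a different kind of protocol.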

`LineBasedFrameDecoder`

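LineBasedFrameDecoder is a very practical Netty decoder dedicated to data streams delimited by line endings. In many text-based protocols (SMTP, FTP, Redis RESP, and so on) messages are transmitted line by line, and this decoder greatly simplifies parsing them.

Its core purpose is to solve the sticky packet and half packet problems of TCP transport. It extends ByteToMessageDecoder, scans the incoming ByteBuf for a line ending (\n or \r\n), extracts the data before the line ending as one complete frame, and passes that frame to the next handler in the pipeline.

For an analysis of ByteToMessageDecoder, see: Netty ByteToMessageDecoder decoding mechanism explained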

// ... existing code ...
/**
 * A decoder that splits the received {@link ByteBuf}s on line endings.
 * <p>
 * Both {@code "\n"} and {@code "\r\n"} are handled.
 * <p>
 * The byte stream is expected to be in UTF-8 character encoding or ASCII. The current implementation
 * uses direct {@code byte} to {@code char} cast and then compares that {@code char} to a few low range
 * ASCII characters like {@code '\n'} or {@code '\r'}. UTF-8 is not using low range [0..0x7F]
 * byte values for multibyte codepoint representations therefore fully supported by this implementation.
 * <p>
 * For a more general delimiter-based decoder, see {@link DelimiterBasedFrameDecoder}.
 * <p>
 * Users should be aware that used as is, the lenient approach on lone {@code '\n} might result on a parser
 * diffenrencial on line based protocols requiring the use of {@code "\r\n"} delimiters like SMTP and can
 * result in attacks similar to
 * <a href="https://sec-consult.com/blog/detail/smtp-smuggling-spoofing-e-mails-worldwide/">SMTP smuggling</a>.
 * Validating afterward the end of line pattern can be a possible mitigation.
 */
public class LineBasedFrameDecoder extends ByteToMessageDecoder {
// ... existing code ...

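The comment tells us the following:

It handles both \n and \r\n line endings.
It names DelimiterBasedFrameDecoder as the more general alternative, which makes LineBasedFrameDecoder the specialized, optimized case. In fact, if you use DelimiterBasedFrameDecoder (not a subclass of it) with the delimiters \n and \r\n, it creates a LineBasedFrameDecoder internally and delegates to it for better performance.
It also points out a security risk (SMTP smuggling): because the decoder is lenient about a lone \n, a protocol that strictly requires \r\n may be parsed differently by different components, which can lead to vulnerabilities.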
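The Javadoc's claim that UTF-8 input is safe to scan byte by byte can be checked with a few lines of plain Java. The sketch below is only an illustration of that property; the class Utf8NewlineScan and its method names are hypothetical and not part of Netty.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: shows that multi-byte UTF-8 sequences never contain the byte 0x0A.
final class Utf8NewlineScan {

    // Counts the bytes equal to '\n' in the UTF-8 encoding of text.
    public static int countNewlineBytes(String text) {
        int count = 0;
        for (byte b : text.getBytes(StandardCharsets.UTF_8)) {
            if (b == (byte) '\n') {
                count++;
            }
        }
        return count;
    }

    // True if every byte of every multi-byte UTF-8 sequence in text has its high bit set,
    // i.e. lies outside the ASCII range [0..0x7F].
    public static boolean multiByteCharsAvoidAscii(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            byte[] enc = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
            if (enc.length > 1) {
                for (byte b : enc) {
                    if ((b & 0x80) == 0) {
                        return false;
                    }
                }
            }
            i += Character.charCount(cp);
        }
        return true;
    }

    public static void main(String[] args) {
        // Two lines of CJK text, written with Unicode escapes to stay independent of file encoding.
        String text = "\u4f60\u597d\n\u4e16\u754c\n";
        System.out.println(countNewlineBytes(text));        // prints 2
        System.out.println(multiByteCharsAvoidAscii(text)); // prints true
    }
}
```

Only the two real line feeds are found, so a byte-level search for \n cannot cut a multi-byte character in half.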

Key fields

The behaviour of LineBasedFrameDecoder is fixed at construction time by a few final fields; the remaining fields track decoding state.

// ... existing code ...
public class LineBasedFrameDecoder extends ByteToMessageDecoder {

    /** Maximum length of a frame we're willing to decode.  */
    private final int maxLength;
    /** Whether or not to throw an exception as soon as we exceed maxLength. */
    private final boolean failFast;
    private final boolean stripDelimiter;

    /** True if we're discarding input because we're already over maxLength.  */
    private boolean discarding;
    private int discardedBytes;

    /** Last scan position. */
    private int offset;
// ... existing code ...

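maxLength: the maximum length of a single line. If the length of a line (excluding the line ending) exceeds this value, the decoder raises a TooLongFrameException. This is an important safeguard against malicious input and unbounded memory use.
stripDelimiter: a boolean that decides whether the decoded frame (ByteBuf) keeps its trailing line ending. true removes the line ending, false keeps it.
failFast: decides when the TooLongFrameException is raised.
- true (fail fast): as soon as the decoder sees that the accumulated data already exceeds maxLength, it raises the exception immediately, even though no line ending has been seen yet.
- false: the decoder keeps receiving data until the over-long line is complete (that is, until its line ending is read), then computes the total length and raises the exception. In the meantime all data belonging to the over-long line is discarded.
discarding: a state flag. It becomes true when an over-long frame is detected. While it is set, the decoder drops all subsequent bytes until it finds a line ending, at which point the flag returns to false.
discardedBytes: records how many bytes have been dropped in the discarding state, so that with failFast=false the exception raised at the end can report the actual length of the over-long frame.
offset: an optimization field. It records where the previous scan stopped. When a new chunk arrives, findEndOfLine starts searching at readerIndex + offset and does not rescan the region already known to contain no line ending.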

Constructors

LineBasedFrameDecoder offers the following constructors to initialize these fields.

// ... existing code ...
    public LineBasedFrameDecoder(final int maxLength) {
        this(maxLength, true, false);
    }

    /**
     * Creates a new decoder.
     * @param maxLength  the maximum length of the decoded frame.
     *                   A {@link TooLongFrameException} is thrown if
     *                   the length of the frame exceeds this value.
     * @param stripDelimiter  whether the decoded frame should strip out the
     *                        delimiter or not
     * @param failFast  If <tt>true</tt>, a {@link TooLongFrameException} is
     *                  thrown as soon as the decoder notices the length of the
     *                  frame will exceed <tt>maxFrameLength</tt> regardless of
     *                  whether the entire frame has been read.
     *                  If <tt>false</tt>, a {@link TooLongFrameException} is
     *                  thrown after the entire frame that exceeds
     *                  <tt>maxFrameLength</tt> has been read.
     */
    public LineBasedFrameDecoder(final int maxLength, final boolean stripDelimiter, final boolean failFast) {
        this.maxLength = maxLength;
        this.failFast = failFast;
        this.stripDelimiter = stripDelimiter;
    }
// ... existing code ...

The most commonly used constructor is LineBasedFrameDecoder(maxLength), which strips the line ending by default (stripDelimiter=true) and does not fail fast (failFast=false).

Core decoding logic: `decode`

The decode method is the heart of the decoder. Its logic splits into two main branches: normal processing (!discarding) and discard mode (discarding).

// ... existing code ...
    protected Object decode(ChannelHandlerContext ctx, ByteBuf buffer) throws Exception {
        final int eol = findEndOfLine(buffer);
        if (!discarding) {
            if (eol >= 0) {
                // ... line ending found: normal processing ...
            } else {
                // ... no line ending found: check whether the data is already too long ...
            }
        } else {
            if (eol >= 0) {
                // ... line ending found in discard mode: stop discarding ...
            } else {
                // ... no line ending found in discard mode: keep discarding ...
            }
            return null;
        }
    }
// ... existing code ...

Finding the line ending: `findEndOfLine`

The first step of decoding is a call to findEndOfLine to locate the line ending.

// ... existing code ...
    private int findEndOfLine(final ByteBuf buffer) {
        int totalLength = buffer.readableBytes();
        int i = buffer.indexOf(buffer.readerIndex() + offset,
                buffer.readerIndex() + totalLength, (byte) '\n');
        if (i >= 0) {
            offset = 0;
            if (i > 0 && buffer.getByte(i - 1) == '\r') {
                i--;
            }
        } else {
            offset = totalLength;
        }
        return i;
    }
}

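It uses buffer.indexOf to search for the byte \n, starting at readerIndex + offset.
If the byte is found (i >= 0), offset is reset to 0 in preparation for the next decode call, and the preceding byte is checked. If that byte is \r, the index i is decremented so that it points at the start of the \r\n delimiter.
If the byte is not found (i < 0), the readable bytes contain no complete line. offset is set to the current number of readable bytes, totalLength, and -1 is returned.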
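The effect of the offset field is easiest to see when the same buffer is scanned twice as data trickles in. The sketch below mirrors the method above on a plain byte array; it is an assumption-laden illustration, with the hypothetical class EndOfLineScanner and explicit readerIndex and writerIndex arguments standing in for ByteBuf.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the resumable end-of-line search on a byte array.
final class EndOfLineScanner {
    private int offset; // number of bytes after readerIndex already scanned without finding '\n'

    // Scans buf[readerIndex, writerIndex) for a line ending.
    // Returns the index of '\n', or of the '\r' directly before it, or -1 if there is none.
    public int findEndOfLine(byte[] buf, int readerIndex, int writerIndex) {
        int totalLength = writerIndex - readerIndex;
        int i = -1;
        for (int p = readerIndex + offset; p < writerIndex; p++) {
            if (buf[p] == '\n') {
                i = p;
                break;
            }
        }
        if (i >= 0) {
            offset = 0; // the next call starts on a fresh line
            if (i > 0 && buf[i - 1] == '\r') {
                i--;    // point at the start of "\r\n"
            }
        } else {
            offset = totalLength; // remember how far we got
        }
        return i;
    }

    public int offset() {
        return offset;
    }

    public static void main(String[] args) {
        byte[] buf = "hello world\r\nX".getBytes(StandardCharsets.US_ASCII);
        EndOfLineScanner scanner = new EndOfLineScanner();
        // Only the first 8 bytes ("hello wo") have arrived: nothing found, offset becomes 8.
        System.out.println(scanner.findEndOfLine(buf, 0, 8));  // prints -1
        System.out.println(scanner.offset());                  // prints 8
        // All 14 bytes have arrived: the scan resumes at index 8 and stops at the '\r'.
        System.out.println(scanner.findEndOfLine(buf, 0, 14)); // prints 11
    }
}
```

The second call inspects only indices 8 through 12 and leaves the first eight bytes alone.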

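Normal processing (`!discarding`)

Line ending found (eol >= 0):
1. Compute the line length, length, and the delimiter length, delimLength (1 for \n, 2 for \r\n).
2. Length check: if length > maxLength, the frame is too long. The decoder skips the whole frame, including its delimiter, calls fail() to signal a TooLongFrameException, and returns null.
3. Frame extraction: if the length is valid, the frame is extracted with or without the delimiter, depending on stripDelimiter. The decoder uses readRetainedSlice(), a zero-copy operation: the returned ByteBuf shares memory with the original buffer but has its own reader and writer indices. The extracted frame is then returned.
Line ending not found (eol < 0):
1. The current ByteBuf does not contain a complete line.
2. Length check: test whether the number of accumulated bytes, buffer.readableBytes(), already exceeds maxLength.
3. If it does, the frame is bound to be too long. The decoder enters discard mode: it sets discarding = true, records the number of discarded bytes, and empties the buffer. If failFast is true, the exception is signalled immediately.
4. If it does not, the decoder returns null and waits for more data.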
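The length and delimiter arithmetic of the "line ending found" branch can be sketched on a byte array as follows. This is a simplified illustration with assumptions: the class FrameExtraction is hypothetical, and it copies bytes with Arrays.copyOfRange where the real decoder returns a zero-copy slice.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical sketch of frame extraction; copies bytes instead of slicing a ByteBuf.
final class FrameExtraction {

    // eol is the index produced by the end-of-line search:
    // the position of '\n', or of the '\r' when the line ends in "\r\n".
    public static byte[] extract(byte[] buf, int readerIndex, int eol, boolean stripDelimiter) {
        int delimLength = buf[eol] == '\r' ? 2 : 1;
        int end = stripDelimiter ? eol : eol + delimLength;
        return Arrays.copyOfRange(buf, readerIndex, end);
    }

    // Position from which the next frame would be read: just past the delimiter.
    public static int nextReaderIndex(byte[] buf, int eol) {
        return eol + (buf[eol] == '\r' ? 2 : 1);
    }

    public static void main(String[] args) {
        byte[] buf = "GET\r\nX".getBytes(StandardCharsets.US_ASCII);
        System.out.println(extract(buf, 0, 3, true).length);  // prints 3 ("GET")
        System.out.println(extract(buf, 0, 3, false).length); // prints 5 ("GET\r\n")
        System.out.println(nextReaderIndex(buf, 3));          // prints 5
    }
}
```

In both modes the reader position advances past the delimiter; stripDelimiter only changes what ends up in the returned frame.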

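Discard mode (`discarding`)

Line ending found (eol >= 0):
1. This marks the end of the over-long frame.
2. readerIndex is advanced past the rest of the over-long frame and its delimiter.
3. The state is reset: discarding = false, discardedBytes = 0.
4. If failFast is false, fail() is called only now to signal the exception.
5. null is returned.
Line ending not found (eol < 0):
1. The over-long frame has not ended yet.
2. All bytes currently in the buffer are added to discardedBytes and the buffer is emptied.
3. null is returned, and the decoder keeps waiting for, and discarding, further data.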
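The two branches, normal processing and discard mode, fit into a small state machine. The following stand-alone sketch is not Netty code: MiniLineDecoder, feed, and the "TOO_LONG:" marker strings are hypothetical, a byte[] replaces the cumulation buffer, and a marker is emitted at the point where the real decoder would signal TooLongFrameException. The line ending is always stripped.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the discard-mode state machine; markers stand in for exceptions.
final class MiniLineDecoder {
    private final int maxLength;
    private final boolean failFast;
    private byte[] pending = new byte[0]; // unread bytes, standing in for the cumulation buffer
    private boolean discarding;
    private int discardedBytes;

    public MiniLineDecoder(int maxLength, boolean failFast) {
        this.maxLength = maxLength;
        this.failFast = failFast;
    }

    // Feeds one chunk and returns decoded lines plus "TOO_LONG:..." markers in order.
    public List<String> feed(String chunk) {
        byte[] in = chunk.getBytes(StandardCharsets.US_ASCII);
        byte[] merged = Arrays.copyOf(pending, pending.length + in.length);
        System.arraycopy(in, 0, merged, pending.length, in.length);
        pending = merged;
        List<String> out = new ArrayList<>();
        while (decodeOne(out)) {
            // keep going while a line ending was consumed
        }
        return out;
    }

    private boolean decodeOne(List<String> out) {
        int nl = -1;
        for (int i = 0; i < pending.length; i++) {
            if (pending[i] == '\n') {
                nl = i;
                break;
            }
        }
        if (nl < 0) {
            if (discarding) {
                discardedBytes += pending.length;   // still inside the over-long line
                pending = new byte[0];
            } else if (pending.length > maxLength) {
                discardedBytes = pending.length;    // enter discard mode
                pending = new byte[0];
                discarding = true;
                if (failFast) {
                    out.add("TOO_LONG:over " + discardedBytes);
                }
            }
            return false;                           // wait for more data
        }
        int eol = (nl > 0 && pending[nl - 1] == '\r') ? nl - 1 : nl;
        byte[] rest = Arrays.copyOfRange(pending, nl + 1, pending.length);
        if (discarding) {
            int length = discardedBytes + eol;      // total length of the over-long line
            discarding = false;
            discardedBytes = 0;
            if (!failFast) {
                out.add("TOO_LONG:" + length);
            }
        } else if (eol > maxLength) {
            out.add("TOO_LONG:" + eol);             // complete line, but too long
        } else {
            out.add(new String(pending, 0, eol, StandardCharsets.US_ASCII));
        }
        pending = rest;
        return true;
    }

    public static void main(String[] args) {
        MiniLineDecoder lazy = new MiniLineDecoder(5, false);
        System.out.println(lazy.feed("abc\nTOOLONG"));  // prints [abc]
        System.out.println(lazy.feed("XX\r\nok\n"));    // prints [TOO_LONG:9, ok]
        MiniLineDecoder eager = new MiniLineDecoder(5, true);
        System.out.println(eager.feed("abc\nTOOLONG")); // prints [abc, TOO_LONG:over 7]
        System.out.println(eager.feed("XX\r\nok\n"));   // prints [ok]
    }
}
```

With failFast=false the error is reported once, after the over-long line ends, and carries its full length (7 discarded bytes plus 2 more before the \r). With failFast=true it is reported as soon as the limit is crossed, and the end of the line then passes silently.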

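Summary

LineBasedFrameDecoder is a well-designed, efficient decoder with a clearly defined job.

Strengths:
- Optimized specifically for line-delimited protocols, so it performs well.
- Simple API: supplying a maximum line length is enough to use it.
- maxLength, failFast, and the other parameters provide solid protection and flexibility.
- The implementation relies on techniques such as zero-copy slicing and is very efficient.
Use cases:
- It is the first choice for any text protocol delimited by \n or \r\n.
- It is usually paired with StringDecoder: LineBasedFrameDecoder does the framing, and StringDecoder converts each ByteBuf frame into a string.
Caveats:
- Always set a sensible maxLength so that a client sending an over-long line cannot exhaust memory.
- Understand what failFast and stripDelimiter mean and configure them according to your requirements.
- Be aware of the security implications of its lenient handling of \n. If the protocol strictly requires \r\n, additional validation may be needed in a downstream handler.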
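One possible shape for such a downstream check is sketched below. It assumes the decoder was created with stripDelimiter=false, so that each frame still carries its line ending; the class CrlfValidator is hypothetical, and in a real pipeline the same test would sit in a handler placed after the decoder.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: rejects frames that end in a lone '\n' instead of "\r\n".
final class CrlfValidator {

    // Expects a frame that still contains its delimiter (stripDelimiter = false).
    public static boolean endsWithCrlf(byte[] frame) {
        int n = frame.length;
        return n >= 2 && frame[n - 2] == '\r' && frame[n - 1] == '\n';
    }

    public static void main(String[] args) {
        byte[] strict = "MAIL FROM:<a@example.com>\r\n".getBytes(StandardCharsets.US_ASCII);
        byte[] lenient = "MAIL FROM:<a@example.com>\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(endsWithCrlf(strict));  // prints true
        System.out.println(endsWithCrlf(lenient)); // prints false
    }
}
```

What to do with a rejected frame (close the connection, reply with a protocol error) depends on the protocol and is left open here.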