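YOLOv10 Improvement | Model Overhaul | Accuracy Gains | Adding a PSDI Feature Fusion Layer That Combines Partial Convolution (PConv) and the SDI Fusion Method to the Neck Network (Code + Modification Tutorial Included)
1. Introduction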
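The model modified in this post is YOLOv10. YOLOv10 needs no non-maximum suppression (NMS) in post-processing, and its paper reports better inference speed and a smaller parameter count than comparable existing detectors. However, some detection tasks must handle targets at several scales at once, and YOLOv10 still shows limitations in such scenarios. To address this, this post introduces the PSDI (Partial Convolution-based Semantic Decoupled Integration) feature fusion module right after the YOLOv10 backbone, at the start of the neck. PSDI first applies partial convolution (PConv) to the feature maps from different levels (denoted F1, F2 and F3) to obtain feature representations with the same number of channels, which mirrors what the original SDI module does with ordinary convolutions. Compared with a standard convolution, PConv convolves only part of the input channels (one quarter with the default `n_div=4` in the code in section 3) and leaves the remaining channels untouched, so it skips much of the redundant computation and extracts key features more efficiently from low-level feature maps that carry a lot of redundant information. This design is intended to improve the model's ability to perceive and detect objects at different scales.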
YOLOv10 paper: https://arxiv.org/pdf/2405.14458
PConv module paper: https://arxiv.org/pdf/2303.03667
SDI module paper: https://arxiv.org/pdf/2311.17791v2
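To make the cost argument for PConv more concrete, the sketch below counts multiply-accumulate operations (MACs) for a dense 3x3 convolution and for the PConv layout used in this post (a 3x3 convolution on `dim // n_div` channels followed by a 1x1 projection to the output channel count). The helper names `conv_macs` and `pconv_macs` and the 80x80x64 example shape are illustrative assumptions; batch normalization, activations and memory access are ignored, so treat the numbers as a rough estimate only.

```python
def conv_macs(h, w, c_in, c_out, k):
    """MACs of a dense k x k convolution (stride 1, 'same' padding, no bias)."""
    return h * w * c_in * c_out * k * k


def pconv_macs(h, w, dim, ouc, n_div=4, k=3):
    """Rough MACs of the PConv block described in this post.

    A k x k convolution touches only dim // n_div channels, then a 1 x 1
    convolution projects all dim channels to ouc channels.
    """
    c_p = dim // n_div                        # channels that are actually convolved
    partial = conv_macs(h, w, c_p, c_p, k)    # 3x3 conv on the partial slice
    pointwise = conv_macs(h, w, dim, ouc, 1)  # 1x1 projection on all channels
    return partial + pointwise


if __name__ == "__main__":
    # Hypothetical shallow feature map: 80 x 80 pixels, 64 channels in, 256 out.
    h, w, dim, ouc = 80, 80, 64, 256
    dense = conv_macs(h, w, dim, ouc, 3)
    partial = pconv_macs(h, w, dim, ouc)
    print(f"dense 3x3 conv: {dense:,} MACs")
    print(f"PConv block   : {partial:,} MACs ({partial / dense:.1%} of dense)")
```

With `n_div=4`, the 3x3 part alone costs 1/16 of a dense 3x3 convolution over the same channels, because both its input and output channel counts shrink by a factor of four.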
2. Model Diagrams
Model architecture (the changes are marked with red arrows)
PSDI module:
PConv module:
SDI module:
3. Core Code
Append the PSDI module definition to block.py.
The PSDI module code is as follows:
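class PSDI(nn.Module):
    def __init__(self, channel):
        super().__init__()
        # Input channels 64/128/256 are hard-coded; they match layers 4, 6 and 10
        # of the yaml below under the yolov10n width multiplier (0.25).
        self.pccov1 = PConv(64, channel)
        self.pccov2 = PConv(128, channel)
        self.pccov3 = PConv(256, channel)
        # One 3x3 conv per input. The fourth input skips PConv, so it must
        # already have `channel` channels.
        self.convs = nn.ModuleList(
            [nn.Conv2d(channel, channel, kernel_size=3, stride=1, padding=1) for _ in range(4)]
        )

    def forward(self, xs):
        # xs: list of four feature maps, ordered from shallow to deep
        xs[0] = self.pccov1(xs[0])
        xs[1] = self.pccov2(xs[1])
        xs[2] = self.pccov3(xs[2])
        anchor = xs[-1]  # the last input sets the output resolution
        ans = torch.ones_like(anchor)
        target_size_h = anchor.shape[2]
        target_size_w = anchor.shape[3]
        for i, x in enumerate(xs):
            # Resize every input to the anchor resolution
            x = F.interpolate(x, size=(target_size_h, target_size_w),
                              mode='bilinear', align_corners=True)
            # Fuse by element-wise multiplication
            ans = ans * self.convs[i](x)
        return ans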
The PConv code is shown below (it also goes in block.py; you can also try swapping PConv for other convolution modules to tune the model):
class PConv(nn.Module):
    def __init__(self, dim, ouc, n_div=4, forward='split_cat'):
        super().__init__()
        self.dim_conv3 = dim // n_div
        self.dim_untouched = dim - self.dim_conv3
        self.partial_conv3 = nn.Conv2d(self.dim_conv3, self.dim_conv3, 3, 1, 1, bias=False)
        self.conv = Conv(dim, ouc, k=1)

        if forward == 'slicing':
            self.forward = self.forward_slicing
        elif forward == 'split_cat':
            self.forward = self.forward_split_cat
        else:
            raise NotImplementedError

    def forward_slicing(self, x):
        # only for inference
        x = x.clone()  # !!! Keep the original input intact for the residual connection later
        x[:, :self.dim_conv3, :, :] = self.partial_conv3(x[:, :self.dim_conv3, :, :])
        x = self.conv(x)
        return x

    def forward_split_cat(self, x):
        # for training/inference
        x1, x2 = torch.split(x, [self.dim_conv3, self.dim_untouched], dim=1)
        x1 = self.partial_conv3(x1)
        x = torch.cat((x1, x2), 1)
        x = self.conv(x)
        return x
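Before wiring the module into the model, it can help to run a quick shape check outside ultralytics. The following is a minimal stand-alone sketch, not the code to paste into block.py: `Conv` is a simplified stand-in for the ultralytics `Conv` block (Conv2d + BatchNorm2d + SiLU), `PConv` keeps only the `split_cat` path, `PSDI` is a condensed copy that does not modify its input list, and the dummy tensor sizes assume a 640x640 input to yolov10n. All of these are assumptions made for the demo.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Conv(nn.Module):
    """Simplified stand-in for the ultralytics Conv block (assumption)."""

    def __init__(self, c1, c2, k=1):
        super().__init__()
        self.conv = nn.Conv2d(c1, c2, k, 1, k // 2, bias=False)
        self.bn = nn.BatchNorm2d(c2)
        self.act = nn.SiLU()

    def forward(self, x):
        return self.act(self.bn(self.conv(x)))


class PConv(nn.Module):
    """Condensed PConv: 3x3 conv on dim // n_div channels, then a 1x1 projection."""

    def __init__(self, dim, ouc, n_div=4):
        super().__init__()
        self.dim_conv3 = dim // n_div
        self.dim_untouched = dim - self.dim_conv3
        self.partial_conv3 = nn.Conv2d(self.dim_conv3, self.dim_conv3, 3, 1, 1, bias=False)
        self.conv = Conv(dim, ouc, k=1)

    def forward(self, x):
        x1, x2 = torch.split(x, [self.dim_conv3, self.dim_untouched], dim=1)
        return self.conv(torch.cat((self.partial_conv3(x1), x2), 1))


class PSDI(nn.Module):
    """Condensed PSDI for the shape check; builds a new list instead of editing xs."""

    def __init__(self, channel):
        super().__init__()
        self.pconvs = nn.ModuleList([PConv(c, channel) for c in (64, 128, 256)])
        self.convs = nn.ModuleList([nn.Conv2d(channel, channel, 3, 1, 1) for _ in range(4)])

    def forward(self, xs):
        feats = [p(x) for p, x in zip(self.pconvs, xs[:3])] + [xs[3]]
        anchor = feats[-1]
        out = torch.ones_like(anchor)
        for conv, x in zip(self.convs, feats):
            x = F.interpolate(x, size=anchor.shape[2:], mode="bilinear", align_corners=True)
            out = out * conv(x)  # element-wise fusion
        return out


def demo():
    torch.manual_seed(0)
    model = PSDI(256).eval()
    # Dummy stand-ins for layers 4, 6 and 10 of yolov10n at 640 x 640 input.
    p3 = torch.randn(1, 64, 80, 80)
    p4 = torch.randn(1, 128, 40, 40)
    p5 = torch.randn(1, 256, 20, 20)
    with torch.no_grad():
        return model([p3, p4, p5, p5])  # the deepest map is passed twice, as in the yaml


if __name__ == "__main__":
    print(demo().shape)
```

The output takes its spatial size from the last input and its channel count from the `channel` argument, which is why the fourth input has to arrive with `channel` channels already.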
Add the PSDI module to `__all__` in both block.py and __init__.py.
Modify tasks.py to add the channel-count definition for the PSDI module.
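The exact tasks.py edit is not shown here. As a rough illustration of the kind of rule it needs: the model-building loop in tasks.py tracks the output channel count of every layer, and a multi-input module such as PSDI needs its own branch for that bookkeeping. The function below is a simplified stand-alone stand-in (string names instead of module classes, a hypothetical `output_channels` helper), and the rule `c2 = args[0]` is an assumption based on `PSDI(channel)` returning `channel` feature maps.

```python
def output_channels(module_name, f, args, ch):
    """Return the output channel count a parse_model-style loop would record.

    module_name: module named in the yaml row (string stand-in for the class)
    f:           the 'from' field, an int or a list of layer indices
    args:        the yaml args list for this row
    ch:          output channels of all layers built so far
    """
    if module_name == "Concat":
        return sum(ch[i] for i in f)  # concatenation adds channel counts
    if module_name == "PSDI":
        return args[0]                # assumption: PSDI(channel) emits `channel` maps
    return ch[f]                      # default: channels pass through unchanged


if __name__ == "__main__":
    # Output channels of layers 0..10 for the 'n' scale (width multiplier 0.25).
    ch = [16, 32, 32, 64, 64, 128, 128, 256, 256, 256, 256]
    ch.append(output_channels("PSDI", [4, 6, 10, 10], [256], ch))              # layer 11
    ch.append(output_channels("nn.Upsample", -1, [None, 2, "nearest"], ch))   # layer 12
    ch.append(output_channels("Concat", [-1, 6], [1], ch))                    # layer 13
    print(ch[11:])
```

Without a dedicated branch, the default rule would index `ch` with a list and fail, which is why the tasks.py step is required for a module that takes several inputs.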
Configure the yaml file, using yolov10n.yaml as the baseline (remember to change the number of classes).
# Parameters
nc: * # number of classes
scales: # model compound scaling constants, i.e. 'model=yolov8n.yaml' will call yolov8.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.33, 0.25, 1024]

# YOLOv8.0n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2
  - [-1, 1, Conv, [128, 3, 2]] # 1-P2/4
  - [-1, 3, C2f, [128, True]]
  - [-1, 1, Conv, [256, 3, 2]] # 3-P3/8
  - [-1, 6, C2f, [256, True]]
  - [-1, 1, SCDown, [512, 3, 2]] # 5-P4/16
  - [-1, 6, C2f, [512, True]]
  - [-1, 1, SCDown, [1024, 3, 2]] # 7-P5/32
  - [-1, 3, C2f, [1024, True]]
  - [-1, 1, SPPF, [1024, 5]] # 9
  - [-1, 1, PSA, [1024]] # 10

# YOLOv8.0n head
head:
  - [[4, 6, 10, 10], 1, PSDI, [256]] # 11
  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 6], 1, Concat, [1]] # cat backbone P4
  - [-1, 3, C2f, [512]] # 14

  - [-1, 1, nn.Upsample, [None, 2, "nearest"]]
  - [[-1, 4], 1, Concat, [1]] # cat backbone P3
  - [-1, 3, C2f, [256]] # 17 (P3/8-small)

  - [-1, 1, Conv, [256, 3, 2]]
  - [[-1, 14], 1, Concat, [1]] # cat head P4
  - [-1, 3, C2f, [512]] # 20 (P4/16-medium)

  - [-1, 1, SCDown, [512, 3, 2]]
  - [[-1, 11], 1, Concat, [1]] # cat head P5
  - [-1, 3, C2fCIB, [1024, True, True]] # 23 (P5/32-large)

  - [[17, 20, 23], 1, v10Detect, [nc]] # Detect(P3, P4, P5)
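The PConv input channels hard-coded in PSDI (64, 128, 256) have to agree with the channels that layers 4, 6 and 10 actually produce after width scaling. The sketch below reproduces the usual scaling rule, `make_divisible(min(c2, max_channels) * width, 8)`, as a stand-alone calculation; the helper `scaled_channels` is a hypothetical name, and the assumption is that these layers follow that rule in your copy of tasks.py.

```python
import math


def make_divisible(x, divisor=8):
    """Round x up to the nearest multiple of divisor."""
    return math.ceil(x / divisor) * divisor


def scaled_channels(c2, width, max_channels):
    """Channel count after applying the width multiplier from `scales`."""
    return make_divisible(min(c2, max_channels) * width, 8)


if __name__ == "__main__":
    depth, width, max_channels = 0.33, 0.25, 1024  # the 'n' row of `scales`
    nominal = {4: 256, 6: 512, 10: 1024}           # yaml args of the layers fed to PSDI
    for layer, c2 in nominal.items():
        print(f"layer {layer}: {c2} -> {scaled_channels(c2, width, max_channels)}")
```

For the `n` scale this gives 64, 128 and 256, which is what the three PConv layers expect. A different width multiplier yields different values, so the hard-coded channels in PSDI would need to change with it.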