DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries
MIT, CoRL 2021
A camera-only (pure vision) BEV approach: a transformer network for 3D object detection
Paper: [2110.06922] DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries
Code: GitHub - WangYueFt/detr3d
- A DNN backbone extracts image features, and an FPN extracts multi-scale features
-
pts_bbox_head Detr3DHead
- transformer Detr3DTransformer
-
Detr3DHead
__init__
    self.query_embedding = nn.Embedding(self.num_query, self.embed_dims * 2)
forward
    query_embeds = self.query_embedding.weight
    hs, init_reference, inter_references = self.transformer(
        mlvl_feats,
        query_embeds,
        reg_branches=self.reg_branches if self.with_box_refine else None,  # noqa:E501
        img_metas=img_metas,
    )
Detr3DTransformer
__init__
    self.embed_dims = self.decoder.embed_dims
    self.reference_points = nn.Linear(self.embed_dims, 3)
forward(self, mlvl_feats, query_embed, reg_branches=None, **kwargs):
    query_pos, query = torch.split(query_embed, self.embed_dims, dim=1)
    query_pos = query_pos.unsqueeze(0).expand(bs, -1, -1)
    reference_points = self.reference_points(query_pos).sigmoid()
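The snippet above splits each learned query embedding (width 2 * embed_dims) into a positional half and a content half, then maps the positional half through a linear layer and a sigmoid to get a 3D reference point in (0, 1)^3. The following is a minimal dependency-free sketch of that data flow for illustration only. The toy sizes, the random weights, and the helper names (`split_query_embedding`, `linear`, `reference_point`) are assumptions and not part of the repository.

```python
import math
import random


def split_query_embedding(row, embed_dims):
    """Mirror torch.split(query_embed, embed_dims, dim=1) for one row:
    the first half is query_pos, the second half is query."""
    return row[:embed_dims], row[embed_dims:]


def linear(x, weight, bias):
    """y = W x + b, with weight stored as a list of output rows."""
    return [sum(w * v for w, v in zip(w_row, x)) + b
            for w_row, b in zip(weight, bias)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def reference_point(query_pos, weight, bias):
    """Stand-in for nn.Linear(embed_dims, 3) followed by .sigmoid()."""
    return [sigmoid(v) for v in linear(query_pos, weight, bias)]


# Toy sizes (hypothetical; real configs use far larger values).
random.seed(0)
embed_dims, num_query = 4, 3
query_embedding = [[random.gauss(0, 1) for _ in range(2 * embed_dims)]
                   for _ in range(num_query)]
weight = [[random.gauss(0, 1) for _ in range(embed_dims)] for _ in range(3)]
bias = [0.0, 0.0, 0.0]

refs = []
for row in query_embedding:
    query_pos, query = split_query_embedding(row, embed_dims)
    refs.append(reference_point(query_pos, weight, bias))

for ref in refs:
    print([round(r, 3) for r in ref])  # three values, each inside (0, 1)
```

Because the reference points depend only on the learned embedding, they start out identical for every input sample, which is why the original code simply expands `query_pos` along the batch dimension.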
-
Detr3DCrossAtten
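The "3D-to-2D queries" in the paper title refer to projecting each 3D reference point into the camera images so that image features can be sampled at the projected location. The sketch below illustrates that projection for one point and one camera, assuming a normalized reference point, a point-cloud range `pc_range`, and a 4x4 `lidar2img` projection matrix. The function names, the toy matrix, and the toy range are hypothetical; it is a simplified illustration, not the repository's implementation.

```python
def denormalize(ref, pc_range):
    """Map a reference point from [0, 1]^3 to metric coordinates.
    pc_range = (x_min, y_min, z_min, x_max, y_max, z_max)."""
    return [r * (pc_range[i + 3] - pc_range[i]) + pc_range[i]
            for i, r in enumerate(ref)]


def project(point_xyz, lidar2img, img_w, img_h, eps=1e-5):
    """Project a 3D point into one camera.
    Returns normalized image coordinates in [0, 1] and a visibility flag."""
    homo = list(point_xyz) + [1.0]
    # Rows 0..2 of the projection give (u * depth, v * depth, depth).
    u, v, depth = (sum(m * p for m, p in zip(row, homo))
                   for row in lidar2img[:3])
    in_front = depth > eps
    depth = max(depth, eps)  # avoid dividing by zero or a negative depth
    x = u / depth / img_w
    y = v / depth / img_h
    visible = in_front and 0.0 < x < 1.0 and 0.0 < y < 1.0
    return (x, y), visible


# Toy pinhole camera (assumed): focal length 100 px, principal point (50, 50),
# 100x100 image, looking along +z of the point's coordinate frame.
lidar2img = [
    [100.0, 0.0, 50.0, 0.0],
    [0.0, 100.0, 50.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
pc_range = (-10.0, -10.0, -10.0, 10.0, 10.0, 10.0)

front = denormalize([0.5, 0.5, 1.0], pc_range)   # 10 m in front of the camera
behind = denormalize([0.5, 0.5, 0.0], pc_range)  # 10 m behind the camera

print(project(front, lidar2img, 100, 100))
print(project(behind, lidar2img, 100, 100)[1])
```

A point that falls behind a camera or outside its image is masked out, so each query only gathers features from the cameras that can actually see its reference point.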
-
MultiheadAttention
-
-
bbox_coder NMSFreeCoder
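As the name suggests, an NMS-free coder turns the per-query predictions directly into final detections without non-maximum suppression. One common way to do this, sketched below, is to take the top-k scores over all (query, class) pairs and recover the label and query index from the flattened position. The function name `decode_top_k`, its arguments, and the toy inputs are assumptions for illustration, and any additional filtering the real coder performs is omitted.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def decode_top_k(cls_logits, boxes, max_num, score_threshold=None):
    """cls_logits: one list of per-class logits per query.
    boxes: one predicted box per query (any representation)."""
    num_classes = len(cls_logits[0])
    # Flatten (query, class) pairs into a single index, as view(-1) would.
    flat = [(sigmoid(logit), q * num_classes + c)
            for q, row in enumerate(cls_logits)
            for c, logit in enumerate(row)]
    flat.sort(key=lambda item: item[0], reverse=True)

    results = []
    for score, index in flat[:max_num]:
        if score_threshold is not None and score <= score_threshold:
            continue
        results.append({
            "score": score,
            "label": index % num_classes,        # class id
            "box": boxes[index // num_classes],  # box of the owning query
        })
    return results


# Toy example: 3 queries, 2 classes.
cls_logits = [[2.0, -1.0], [0.5, 3.0], [-2.0, -3.0]]
boxes = ["box0", "box1", "box2"]

results = decode_top_k(cls_logits, boxes, max_num=3)
for r in results:
    print(r["label"], r["box"], round(r["score"], 3))
```

Note that in this scheme the same query can appear more than once with different labels, because the top-k runs over (query, class) pairs rather than over queries.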
-
loss_cls FocalLoss
- transformer Detr3DTransformer
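For the loss_cls FocalLoss entry above, the sigmoid focal loss for a single logit can be written as FL = -alpha_t * (1 - p_t)^gamma * log(p_t). Here is a minimal sketch, assuming the commonly used values gamma = 2.0 and alpha = 0.25; the values actually configured for DETR3D should be checked in the repository's config files.

```python
import math


def sigmoid_focal_loss(logit, target, gamma=2.0, alpha=0.25):
    """Focal loss for one binary (per-class) prediction.
    target is 1 for a positive sample and 0 for a negative one."""
    p = 1.0 / (1.0 + math.exp(-logit))
    p_t = p if target == 1 else 1.0 - p               # prob. of the true class
    alpha_t = alpha if target == 1 else 1.0 - alpha   # class-balancing weight
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


hard_positive = sigmoid_focal_loss(0.0, 1)  # uncertain prediction, p = 0.5
easy_positive = sigmoid_focal_loss(4.0, 1)  # confident and correct
print(hard_positive, easy_positive)
```

The (1 - p_t)^gamma factor shrinks the loss of well-classified samples, so the many easy background queries contribute little and training focuses on the hard ones.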