Solutions to some problems encountered when running yolov5 v2.0 locally
1.UnicodeDecodeError: 'gbk' codec can't decode byte 0x80 in position 176: illegal multibyte sequence
This error occurs because the yaml module reads origincar.yaml with the default system encoding (e.g. GBK) and the file contains byte sequences that are illegal in that encoding. To solve it, explicitly open the file with encoding utf-8; alternatively, simply delete the comments from the yaml file and run again.
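Below is a minimal sketch of the utf-8 fix. The open()/yaml.load call is only illustrative of where the data yaml is read; the exact line in your copy of train.py may look slightly different:

import yaml

# Illustrative load site: passing encoding='utf-8' avoids decoding the file
# with the system default (GBK on a Chinese-locale Windows install).
with open('origincar.yaml', encoding='utf-8') as f:
    data_dict = yaml.load(f, Loader=yaml.FullLoader)  # dict with train/val paths, nc, names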
2.RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
The full error message is:
Traceback (most recent call last):
  File "D:\pycharm\yolov5v2.0\train.py", line 469, in <module>
    train(hyp, tb_writer, opt, device)
  File "D:\pycharm\yolov5v2.0\train.py", line 291, in train
    loss, loss_items = compute_loss(pred, targets.to(device), model)  # scaled by batch_size
  File "D:\pycharm\yolov5v2.0\utils\utils.py", line 443, in compute_loss
    tcls, tbox, indices, anchors = build_targets(p, targets, model)  # targets
  File "D:\pycharm\yolov5v2.0\utils\utils.py", line 532, in build_targets
    a, t = at[j], t.repeat(na, 1, 1)[j]  # filter
RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)
This error occurs because some of the tensors passed to the build_targets function live on different devices (CPU vs. GPU). To fix it, you need to make sure all of the tensors involved are on the same device; a minimal reproduction follows.
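The sketch below reproduces the mismatch on a machine with a CUDA GPU (and a recent PyTorch build); the tensor names mirror those in build_targets, but the shapes and values are made up:

import torch

at = torch.arange(3).view(3, 1).repeat(1, 4)  # CPU tensor, like the original `at`
j = torch.rand(3, 4, device='cuda') > 0.5     # boolean mask computed on the GPU
# at[j]                                       # -> RuntimeError: indices should be either on cpu or ...
at = at.to(j.device)                          # put both operands on the same device
filtered = at[j]                              # fancy indexing now succeeds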
Solution: modify the build_targets function as shown below.
Key changes:
- Get the device of the target tensor at the start of the function (device = targets.device).
- Move the related tensors onto that device: the anchors tensor is moved to device on every loop iteration.
- When computing gain, make sure the torch.tensor calls pass device=device.
- The at tensor is created directly on device.
def build_targets(p, targets, model):
    # Build targets for compute_loss(), input targets(image, class, x, y, w, h)
    device = targets.device
    det = model.module.model[-1] if isinstance(model, (nn.parallel.DataParallel, nn.parallel.DistributedDataParallel)) else model.model[-1]  # Detect() module
    na, nt = det.na, targets.shape[0]  # number of anchors, targets
    tcls, tbox, indices, anch = [], [], [], []
    gain = torch.ones(6, device=device)  # normalized to gridspace gain
    off = torch.tensor([[1, 0], [0, 1], [-1, 0], [0, -1]], device=device).float()  # overlap offsets
    at = torch.arange(na, device=device).view(na, 1).repeat(1, nt)  # anchor tensor, same as .repeat_interleave(nt)

    g = 0.5  # offset
    style = 'rect4'
    for i in range(det.nl):
        anchors = det.anchors[i].to(device)  # Move anchors to the same device
        gain[2:] = torch.tensor(p[i].shape, device=device)[[3, 2, 3, 2]]  # xyxy gain

        # Match targets to anchors
        a, t, offsets = [], targets * gain, 0
        if nt:
            r = t[None, :, 4:6] / anchors[:, None]  # wh ratio
            j = torch.max(r, 1. / r).max(2)[0] < model.hyp['anchor_t']  # compare
            # j = wh_iou(anchors, t[:, 4:6]) > model.hyp['iou_t']  # iou(3,n) = wh_iou(anchors(3,2), gwh(n,2))
            a, t = at[j], t.repeat(na, 1, 1)[j]  # filter

            # overlaps
            gxy = t[:, 2:4]  # grid xy
            z = torch.zeros_like(gxy)
            if style == 'rect2':
                j, k = ((gxy % 1. < g) & (gxy > 1.)).T
                a, t = torch.cat((a, a[j], a[k]), 0), torch.cat((t, t[j], t[k]), 0)
                offsets = torch.cat((z, z[j] + off[0], z[k] + off[1]), 0) * g
            elif style == 'rect4':
                j, k = ((gxy % 1. < g) & (gxy > 1.)).T
                l, m = ((gxy % 1. > (1 - g)) & (gxy < (gain[[2, 3]] - 1.))).T
                a, t = torch.cat((a, a[j], a[k], a[l], a[m]), 0), torch.cat((t, t[j], t[k], t[l], t[m]), 0)
                offsets = torch.cat((z, z[j] + off[0], z[k] + off[1], z[l] + off[2], z[m] + off[3]), 0) * g

        # Define
        b, c = t[:, :2].long().T  # image, class
        gxy = t[:, 2:4]  # grid xy
        gwh = t[:, 4:6]  # grid wh
        gij = (gxy - offsets).long()
        gi, gj = gij.T  # grid xy indices

        # Append
        indices.append((b, a, gj, gi))  # image, anchor, grid indices
        tbox.append(torch.cat((gxy - gij, gwh), 1))  # box
        anch.append(anchors[a])  # anchors
        tcls.append(c)  # class

    return tcls, tbox, indices, anch
3.TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
The full error message is:
Traceback (most recent call last):
  File "D:\pycharm\yolov5v2.0\train.py", line 469, in <module>
    train(hyp, tb_writer, opt, device)
  File "D:\pycharm\yolov5v2.0\train.py", line 340, in train
    results, maps, times = test.test(opt.data,
  File "D:\pycharm\yolov5v2.0\test.py", line 176, in test
    plot_images(img, output_to_target(output, width, height), paths, str(f), names)  # predictions
  File "D:\pycharm\yolov5v2.0\utils\utils.py", line 914, in output_to_target
    return np.array(targets)
  File "D:\Anaconda3\envs\yolov5\lib\site-packages\torch\_tensor.py", line 956, in __array__
    return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
This error comes from trying to convert a tensor that still lives on a CUDA device directly into a NumPy array. You first need to move the tensor from the CUDA device to the CPU by calling .cpu(), and only then convert it; a quick illustration follows.
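A minimal reproduction of the failure and its fix (assumes a CUDA GPU is available; the tensor is just a stand-in for the detections):

import numpy as np
import torch

preds = torch.rand(4, 7, device='cuda')  # stand-in for a detection tensor living on the GPU
# np.array(preds)                        # -> TypeError: can't convert cuda:0 device type tensor to numpy ...
arr = np.array(preds.cpu())              # copy to host memory first, then convert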
Solution: modify the output_to_target function:
def output_to_target(output, width, height):
    # Convert model output to target format [batch_id, class_id, x, y, w, h, conf]
    targets = []
    if isinstance(output, torch.Tensor):
        output = output.cpu().numpy()
    if isinstance(output, np.ndarray):
        output = [output]  # wrap a single NumPy array in a list so all cases are handled uniformly
    for i, o in enumerate(output):
        if o is not None:
            if isinstance(o, torch.Tensor):
                o = o.cpu().numpy()  # make sure the tensor is converted to a NumPy array
            for pred in o:
                box = pred[:4]
                w = (box[2] - box[0]) / width
                h = (box[3] - box[1]) / height
                x = box[0] / width + w / 2
                y = box[1] / height + h / 2
                conf = pred[4]
                cls = int(pred[5])
                targets.append([i, cls, x, y, w, h, conf])
    return np.array(targets)
The function first checks whether output is a tensor and, if so, converts it to a NumPy array. If output is a single NumPy array, it is wrapped in a list so that all cases can be handled uniformly. It then iterates over every element of the output list, converting any element that is still a tensor into a NumPy array. Finally, each processed target is appended to the targets list, which is returned as a NumPy array.
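As a quick sanity check, you can feed the patched function a hand-made prediction; the box values below are invented and the printed numbers are only approximate:

import torch

fake_output = [torch.tensor([[10., 20., 110., 220., 0.9, 2.]])]  # one image, one xyxy box + conf + class
print(output_to_target(fake_output, width=640, height=480))
# roughly [[0, 2, 0.094, 0.25, 0.156, 0.417, 0.9]] -> [batch_id, class_id, x, y, w, h, conf]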
Check whether the GPU is available:
python -c "import torch; print(torch.cuda.is_available())"
If this prints True, the GPU is available.
You can then set the default of the --device argument in train.py from '' to '0':
parser.add_argument('--device', default='0', help='cuda device, i.e. 0 or 0,1,2,3 or cpu')
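Equivalently, you can leave the default as '' and select the GPU on the command line when launching training; any other arguments stay whatever you normally pass:

python train.py --device 0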