
1. Background

Last Thursday, at the end of the workday, I found that an existing PyTorch model that had been working well in the host environment no longer worked after being converted via .pt -> .onnx -> .rknn. By Friday evening the suspects had been narrowed down to two:

  1. version mismatches
  2. channel order and parameter passing

Over the weekend I brought the PyTorch versions of the various environments into line, aligning them to the single version required by rknn-toolkit2 2.3. Since I also suspected that the training environment and the pt -> rknn conversion environment were inconsistent, I merged the two environments into one Docker image.

training env:

torch                        2.4.0+cpu
torchaudio                   2.4.0+cpu
torchvision                  0.19.0+cpu

ultralytics                  8.3.68      /ultralytics
ultralytics-thop             2.0.14

rknn env:

torch                    2.4.0+cpu
torchaudio               2.4.0+cpu
torchvision              0.19.0+cpu 

In the last detect test on Sunday, the target object still could not be detected. The rest of this post analyzes the problem and tries to solve it.

2. Attempt 1: increase model capacity, yolov11n -> yolov11s

Late on Sunday I started training yolov11s.pt on a small test dataset called moonpie. This time I consolidated the pt -> onnx -> rknn tooling into a single Docker image and raised the epochs to 250 (batch=16, imgsz=640):

model=YOLO('yolo11.yaml').load('yolo11s.pt')

result = model.train(data=r'./moonpie.yaml', epochs=250, batch=16, imgsz=640, device='cpu')

The final training results:

results_dict: {'metrics/precision(B)': 0.8658830071855359, 'metrics/recall(B)': 0.770949720670391, 'metrics/mAP50(B)': 0.8821807607242769, 'metrics/mAP50-95(B)': 0.6566184427590052, 'fitness': 0.6791746745555324}
save_dir: PosixPath('/app/rk3588_build/yolo_sdk/ultralytics/runs/detect/train4')
speed: {'preprocess': 0.9470678144885648, 'inference': 83.09212807686097, 'loss': 0.00011536382859753024, 'postprocess': 1.3212234743179814}
task: 'detect'

Then something suddenly occurred to me: in the tests I had been running since Thursday, the new object was returned in the 81st class slot. Could it be that I forgot to account for the size of the class list when handling the detected class_id? If so, that would be a very basic mistake.
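To make that suspicion concrete: if the detect script keeps a stock 80-entry COCO class list but the model reports the new object as index 80 (the 81st slot), the lookup simply falls off the end of the list. A minimal sketch, where the short list and the index are illustrative only:

CLASSES = ['person', 'bicycle', 'car']   # stand-in for the stock 80-entry COCO list
class_id = len(CLASSES)                  # a "new" object reported after the last slot

try:
    print(CLASSES[class_id])
except IndexError:
    # Either the list has to grow to cover the new class, or the class list used at
    # detection time has to match the training labels (the final detect code below
    # simply puts 'moonpie' at index 0).
    print('class_id is out of range of the CLASSES list')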

2.1 Run the final test directly in the simulated environment

 step1. pt2onnx

from ultralytics import YOLO

# Load a model
model = YOLO("/app/rk3588_build/last_moonpie_yolov11s.pt")  # load an official model
#model = YOLO(r"./best.pt")  # load a custom trained model
# Export the model
model.export(format="onnx")

Ultralytics 8.3.68 🚀 Python-3.8.10 torch-2.4.0+cpu CPU (unknown)
YOLO11 summary (fused): 238 layers, 2,617,701 parameters, 0 gradients, 6.5 GFLOPs

PyTorch: starting from '/app/rk3588_build/last_moonpie_yolov11s.pt' with input shape (1, 3, 640, 640) BCHW and output shape(s) (1, 85, 8400) (5.3 MB)

ONNX: starting export with onnx 1.17.0 opset 19...
ONNX: slimming with onnxslim 0.1.48...
ONNX: export success ✅ 0.6s, saved as '/app/rk3588_build/last_moonpie_yolov11s.onnx' (10.2 MB)

Export complete (0.9s)
Results saved to /app/rk3588_build
Predict:         yolo predict task=detect model=/app/rk3588_build/last_moonpie_yolov11s.onnx imgsz=640  
Validate:        yolo val task=detect model=/app/rk3588_build/last_moonpie_yolov11s.onnx imgsz=640 data=./moonpie.yaml  
Visualize:       https://netron.app

step2. test rknn detect:

>>>>>>>>>>>>>>>original model: /app/rk3588_build/last_moonpie_yolov11s.onnx
--> Running model
I GraphPreparing : 100%|███████████████████████████████████████| 238/238 [00:00<00:00, 19028.68it/s]
I SessionPreparing : 100%|██████████████████████████████████████| 238/238 [00:00<00:00, 5188.41it/s]
target pic has no object concerned.
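The log above is from running the converted model in the rknn-toolkit2 PC simulator. The conversion script itself is not shown in this post, but it follows the standard rknn-toolkit2 flow; a minimal sketch, where the normalization values are assumptions rather than values taken from this project:

from rknn.api import RKNN

ONNX_MODEL = '/app/rk3588_build/last_moonpie_yolov11s.onnx'
RKNN_MODEL = '/app/rk3588_build/rknn_models/sim_moonpie-640-640_rk3588.rknn'

rknn = RKNN(verbose=True)
# 0/255 mean/std is the usual choice when the network expects 0..1 inputs (assumption here)
rknn.config(mean_values=[[0, 0, 0]], std_values=[[255, 255, 255]], target_platform='rk3588')
rknn.load_onnx(model=ONNX_MODEL)
rknn.build(do_quantization=False)   # QUANTIZE_ON is False in the detect script later on
rknn.export_rknn(RKNN_MODEL)
rknn.init_runtime()                 # no target board -> run in the PC simulator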

This time I decided I had to focus on the code in detect.py.

3. The detect.py code:

    # Set inputs
    img = cv2.imread(IMG_PATH)
    # img, ratio, (dw, dh) = letterbox(img, new_shape=(IMG_SIZE, IMG_SIZE))
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))

    print(f'>>>>>>>>>>>>>>>original model: {MODEL1}')

    # Inference
    print('--> Running model')
    img2 = np.expand_dims(img, 0)
    outputs = rknn.inference(inputs=[img2], data_format=['nhwc'])

The BGR2RGB here looks a little suspicious; the second suspect is data_format=['nhwc'].

3.1 Reviewing the channel settings used during training:

yolo11.yaml turns up a major suspect:

# Parameters
nc: 80 # number of classes
scales: # model compound scaling constants, i.e. 'model=yolo11n.yaml' will call yolo11.yaml with scale 'n'
  # [depth, width, max_channels]
  n: [0.50, 0.25, 1024] # summary: 319 layers, 2624080 parameters, 2624064 gradients, 6.6 GFLOPs
  s: [0.50, 0.50, 1024] # summary: 319 layers, 9458752 parameters, 9458736 gradients, 21.7 GFLOPs
  m: [0.50, 1.00, 512] # summary: 409 layers, 20114688 parameters, 20114672 gradients, 68.5 GFLOPs
  l: [1.00, 1.00, 512] # summary: 631 layers, 25372160 parameters, 25372144 gradients, 87.6 GFLOPs
  x: [1.00, 1.50, 512] # summary: 631 layers, 56966176 parameters, 56966160 gradients, 196.0 GFLOPs

# YOLO11n backbone
backbone:
  # [from, repeats, module, args]
  - [-1, 1, Conv, [64, 3, 2]] # 0-P1/2

nc is still configured as 80! And only the YOLO11n configuration appears in the file. Keep digging.
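For context, when training through the Ultralytics API the class count and names normally come from the dataset yaml (moonpie.yaml here), which overrides nc: 80 in yolo11.yaml. That file is not shown in this post, but the exported output shape (1, 85, 8400) seen earlier in the export log corresponds to 4 box values plus 81 class scores, which is consistent with the new object sitting in the 81st slot. A purely hypothetical sketch of such a dataset file:

# moonpie.yaml (hypothetical sketch, not the actual file)
path: /app/datasets/moonpie   # dataset root (assumed)
train: images/train
val: images/val

# 80 COCO classes plus the new object appended as index 80 (the "81st slot"),
# i.e. nc = 81, matching the exported (1, 4 + 81, 8400) output shape.
names:
  0: person
  1: bicycle
  # ... COCO classes 2 to 79 ...
  80: moonpie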

Detect - Ultralytics YOLO Docs

When converting YOLO to ONNX, the additional parameters are as follows:

3.1.1 Question 1: does ONNX impose any restriction on the color channel order of the input?

This seems to depend on the color channel order used when reading images during training. Reportedly, reading an image with Python's OpenCV library gives BGR order by default, while the Pillow library reads images in RGB order.
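A quick way to see the difference, as a minimal sketch (the image path is illustrative, and since the two libraries may use slightly different JPEG decoders the comparison may not be exact byte for byte):

import cv2
import numpy as np
from PIL import Image

path = 'cake26.jpg'                                 # illustrative path

bgr = cv2.imread(path)                              # OpenCV: HWC array, BGR channel order
rgb = np.asarray(Image.open(path).convert('RGB'))   # Pillow: HWC array, RGB channel order

# Reversing the channel axis of the OpenCV image should (approximately) match Pillow.
print(np.array_equal(bgr[..., ::-1], rgb))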

yolov11- patches.py

# OpenCV Multilanguage-friendly functions ------------------------------------------------------------------------------
_imshow = cv2.imshow  # copy to avoid recursion errors


def imread(filename: str, flags: int = cv2.IMREAD_COLOR):
    """Read an image from a file.

    Args:
        filename (str): Path to the file to read.
        flags (int, optional): Flag that can take values of cv2.IMREAD_*. Defaults to cv2.IMREAD_COLOR.

    Returns:
        (np.ndarray): The read image.
    """
    return cv2.imdecode(np.fromfile(filename, np.uint8), flags)

Pillow, meanwhile, shows up in:

./ultralytics/data/loaders.py:                    # Load HEIC image using Pillow with pillow-heif
./ultralytics/data/loaders.py:                    check_requirements("pillow-heif")
./ultralytics/data/loaders.py:                    from pillow_heif import register_heif_opener 

./ultralytics/cfg/datasets/ImageNet.yaml:  721: pillow
./ultralytics/cfg/datasets/ImageNet.yaml:  n03938244: pillow
./ultralytics/cfg/datasets/lvis.yaml:  803: pillow 

./pyproject.toml:    "pillow>=7.1.2", 

In loaders.py there is:

                    register_heif_opener()  # Register HEIF opener with Pillow
                    with Image.open(path) as img:
                        im0 = cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)  # convert image to BGR nparray

And in the YOLOv8 ONNX example code built around cv2.imread there is:

    def preprocess(self):
        """Preprocesses the input image before performing inference.

        Returns:
            image_data: Preprocessed image data ready for inference.
        """
        # Read the input image using OpenCV
        self.img = cv2.imread(self.input_image)

        # Get the height and width of the input image
        self.img_height, self.img_width = self.img.shape[:2]

        # Convert the image color space from BGR to RGB
        img = cv2.cvtColor(self.img, cv2.COLOR_BGR2RGB)

        # Resize the image to match the input shape
        img = cv2.resize(img, (self.input_width, self.input_height))

        # Normalize the image data by dividing it by 255.0
        image_data = np.array(img) / 255.0

        # Transpose the image to have the channel dimension as the first dimension
        image_data = np.transpose(image_data, (2, 0, 1))  # Channel first

        # Expand the dimensions of the image data to match the expected input shape
        image_data = np.expand_dims(image_data, axis=0).astype(np.float32)

        # Return the preprocessed image data
        return image_data

Conclusion: it is fairly safe to say that the color channel order finally fed into ONNX is RGB, so the color channel handling in detect.py is not the problem.

3.2 A more detailed explanation:

1. Color space conversion
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
This converts the image from the BGR (blue, green, red) color space to RGB (red, green, blue). At this point the image data is a three-dimensional array of shape (H, W, C), where H is the image height, W the width and C the number of channels (here C = 3, since it is an RGB image).

2. Image resizing
img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))
This resizes the image to (IMG_SIZE, IMG_SIZE). The result is still a three-dimensional array of shape (IMG_SIZE, IMG_SIZE, 3), i.e. it stays in HWC format.

3. Adding the batch dimension
img2 = np.expand_dims(img, 0)
np.expand_dims inserts a new dimension at the given axis. Adding one at axis 0 turns the (IMG_SIZE, IMG_SIZE, 3) array into a four-dimensional array of shape (1, IMG_SIZE, IMG_SIZE, 3). The new leading dimension is the batch size (N = 1), so the image data is now in NHWC format.

4. Declaring the format at inference time
outputs = rknn.inference(inputs=[img2], data_format=['nhwc'])
This calls the rknn inference function and explicitly declares that the input img2 is in NHWC format.

In summary, after this sequence of operations the data img2 that finally goes into rknn.inference is in NHWC format.
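A minimal sketch that just verifies those shape transitions with a dummy array (no model involved):

import numpy as np

IMG_SIZE = 640
img = np.zeros((IMG_SIZE, IMG_SIZE, 3), dtype=np.uint8)   # stand-in for the resized RGB frame (HWC)

img2 = np.expand_dims(img, 0)
print(img2.shape)                                # (1, 640, 640, 3) -> NHWC, batch size 1

# If an NCHW layout were needed instead (as in the ONNX path later in this post),
# the channel axis would have to be moved explicitly:
print(np.transpose(img2, (0, 3, 1, 2)).shape)    # (1, 3, 640, 640)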

4. The case that finally worked

In the end I followed:

https://blog.csdn.net/zhangqian_1/article/details/142722526
and got it working. The gist is that for yolov11, both the output tensors of the .onnx model and the outputs consumed at inference time differ from the YOLO detect demo shipped with rknn-toolkit2 2.3. I still need to work out exactly why; the modifications in that reference are not perfect and come with no explanation. The changed items are as follows:
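One quick way to see the difference is to list the outputs of the exported model with onnxruntime; a minimal sketch (the six-head layout assumes the Detect.forward replacement shown in 4.1.1 below):

import onnxruntime as ort

sess = ort.InferenceSession('/app/rk3588_build/yolo11_selfgen.onnx')
for o in sess.get_outputs():
    print(o.name, o.shape)

# With the modified Detect.forward, six outputs are expected (reg1/cls1, reg2/cls2, reg3/cls3)
# at strides 8/16/32, i.e. feature maps of 80x80, 40x40 and 20x20, where each reg* carries the
# 64-channel DFL box distribution and each cls* the per-class scores. The stock Ultralytics
# export instead produces a single decoded tensor such as the (1, 85, 8400) seen in section 2.1.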

4.1 The final detection code

#!/usr/bin/env python3
# -*- coding:utf-8 -*-
import argparse
import os
import sys
import os.path as osp
import cv2
import torch
import numpy as np
import onnxruntime as ort
from math import exp

ROOT = os.getcwd()
if str(ROOT) not in sys.path:
    sys.path.append(str(ROOT))

#ONNX_MODEL = r'/app/rk3588_build/yolo_sdk/ultralytics/yolo11s.onnx'
ONNX_MODEL = r'/app/rk3588_build/yolo11_selfgen.onnx'
#ONNX_MODEL = 'yolov5s_relu.onnx'
#ONNX_MODEL= '/app/rk3588_build/last_moonpie.onnx'
#ONNX_MODEL= '/app/rk3588_build/last_moonpie_yolov11s.onnx'
#ONNX_MODEL= '/app/rk3588_build/best.onnx'
#PYTORCH_MODEL=r"/app/rk3588_build/yolo_sdk/ultralytics/best.pt" #driller model 走不通,版本太严格
RKNN_MODEL = r'/app/rk3588_build/rknn_models/sim_moonpie-640-640_rk3588.rknn'
#IMG_PATH = './frame_2266.png'
DATASET = './dataset.txt'
#IMG_PATH = './bus.jpg'
IMG_PATH = '/app/rk3588_build/cake26.jpg'
QUANTIZE_ON = False

CLASSES = ['moonpie', 'bicycle', 'car', 'motorcycle', 'airplane', 'bus', 'train', 'truck', 'boat', 'traffic light',
           'fire hydrant', 'stop sign', 'parking meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow',
           'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee',
           'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard',
           'tennis racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple',
           'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'couch',
           'potted plant', 'bed', 'dining table', 'toilet', 'tv', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone',
           'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear',
           'hair drier', 'toothbrush']

meshgrid = []

class_num = len(CLASSES)
headNum = 3
strides = [8, 16, 32]
mapSize = [[80, 80], [40, 40], [20, 20]]
nmsThresh = 0.45
objectThresh = 0.5

input_imgH = 640
input_imgW = 640


class DetectBox:
    def __init__(self, classId, score, xmin, ymin, xmax, ymax):
        self.classId = classId
        self.score = score
        self.xmin = xmin
        self.ymin = ymin
        self.xmax = xmax
        self.ymax = ymax


def GenerateMeshgrid():
    # Pre-compute the grid cell center coordinates for the three detection heads
    for index in range(headNum):
        for i in range(mapSize[index][0]):
            for j in range(mapSize[index][1]):
                meshgrid.append(j + 0.5)
                meshgrid.append(i + 0.5)


def IOU(xmin1, ymin1, xmax1, ymax1, xmin2, ymin2, xmax2, ymax2):
    xmin = max(xmin1, xmin2)
    ymin = max(ymin1, ymin2)
    xmax = min(xmax1, xmax2)
    ymax = min(ymax1, ymax2)

    innerWidth = xmax - xmin
    innerHeight = ymax - ymin
    innerWidth = innerWidth if innerWidth > 0 else 0
    innerHeight = innerHeight if innerHeight > 0 else 0
    innerArea = innerWidth * innerHeight

    area1 = (xmax1 - xmin1) * (ymax1 - ymin1)
    area2 = (xmax2 - xmin2) * (ymax2 - ymin2)
    total = area1 + area2 - innerArea
    return innerArea / total


def NMS(detectResult):
    predBoxs = []
    sort_detectboxs = sorted(detectResult, key=lambda x: x.score, reverse=True)
    for i in range(len(sort_detectboxs)):
        xmin1 = sort_detectboxs[i].xmin
        ymin1 = sort_detectboxs[i].ymin
        xmax1 = sort_detectboxs[i].xmax
        ymax1 = sort_detectboxs[i].ymax
        classId = sort_detectboxs[i].classId
        if sort_detectboxs[i].classId != -1:
            predBoxs.append(sort_detectboxs[i])
            for j in range(i + 1, len(sort_detectboxs), 1):
                if classId == sort_detectboxs[j].classId:
                    xmin2 = sort_detectboxs[j].xmin
                    ymin2 = sort_detectboxs[j].ymin
                    xmax2 = sort_detectboxs[j].xmax
                    ymax2 = sort_detectboxs[j].ymax
                    iou = IOU(xmin1, ymin1, xmax1, ymax1, xmin2, ymin2, xmax2, ymax2)
                    if iou > nmsThresh:
                        sort_detectboxs[j].classId = -1
    return predBoxs


def sigmoid(x):
    return 1 / (1 + exp(-x))


def postprocess(out, img_h, img_w):
    print('postprocess ... ')

    detectResult = []
    output = []
    for i in range(len(out)):
        print(out[i].shape)
        output.append(out[i].reshape((-1)))

    scale_h = img_h / input_imgH
    scale_w = img_w / input_imgW

    gridIndex = -2
    cls_index = 0
    cls_max = 0

    for index in range(headNum):
        reg = output[index * 2 + 0]
        cls = output[index * 2 + 1]

        for h in range(mapSize[index][0]):
            for w in range(mapSize[index][1]):
                gridIndex += 2

                if 1 == class_num:
                    cls_max = sigmoid(cls[0 * mapSize[index][0] * mapSize[index][1] + h * mapSize[index][1] + w])
                    cls_index = 0
                else:
                    for cl in range(class_num):
                        cls_val = cls[cl * mapSize[index][0] * mapSize[index][1] + h * mapSize[index][1] + w]
                        if 0 == cl:
                            cls_max = cls_val
                            cls_index = cl
                        else:
                            if cls_val > cls_max:
                                cls_max = cls_val
                                cls_index = cl
                    cls_max = sigmoid(cls_max)

                if cls_max > objectThresh:
                    # DFL decode: softmax over the 16 bins of each box side, then take the expectation
                    regdfl = []
                    for lc in range(4):
                        sfsum = 0
                        locval = 0
                        for df in range(16):
                            temp = exp(reg[((lc * 16) + df) * mapSize[index][0] * mapSize[index][1] + h * mapSize[index][1] + w])
                            reg[((lc * 16) + df) * mapSize[index][0] * mapSize[index][1] + h * mapSize[index][1] + w] = temp
                            sfsum += temp
                        for df in range(16):
                            sfval = reg[((lc * 16) + df) * mapSize[index][0] * mapSize[index][1] + h * mapSize[index][1] + w] / sfsum
                            locval += sfval * df
                        regdfl.append(locval)

                    x1 = (meshgrid[gridIndex + 0] - regdfl[0]) * strides[index]
                    y1 = (meshgrid[gridIndex + 1] - regdfl[1]) * strides[index]
                    x2 = (meshgrid[gridIndex + 0] + regdfl[2]) * strides[index]
                    y2 = (meshgrid[gridIndex + 1] + regdfl[3]) * strides[index]

                    xmin = x1 * scale_w
                    ymin = y1 * scale_h
                    xmax = x2 * scale_w
                    ymax = y2 * scale_h

                    xmin = xmin if xmin > 0 else 0
                    ymin = ymin if ymin > 0 else 0
                    xmax = xmax if xmax < img_w else img_w
                    ymax = ymax if ymax < img_h else img_h

                    box = DetectBox(cls_index, cls_max, xmin, ymin, xmax, ymax)
                    detectResult.append(box)

    # NMS
    print('detectResult:', len(detectResult))
    predBox = NMS(detectResult)

    return predBox


def precess_image(img_src, resize_w, resize_h):
    image = cv2.resize(img_src, (resize_w, resize_h), interpolation=cv2.INTER_LINEAR)
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
    image = image.astype(np.float32)
    image /= 255.0
    return image


def detect(img_path):
    orig = cv2.imread(img_path)
    img_h, img_w = orig.shape[:2]
    image = precess_image(orig, input_imgW, input_imgH)

    image = image.transpose((2, 0, 1))
    image = np.expand_dims(image, axis=0)

    # image = np.ones((1, 3, 384, 640), dtype=np.float32)
    # print(image.shape)

    ort_session = ort.InferenceSession(ONNX_MODEL)
    pred_results = (ort_session.run(None, {'data': image}))

    out = []
    for i in range(len(pred_results)):
        out.append(pred_results[i])
    predbox = postprocess(out, img_h, img_w)

    print('obj num is :', len(predbox))

    for i in range(len(predbox)):
        xmin = int(predbox[i].xmin)
        ymin = int(predbox[i].ymin)
        xmax = int(predbox[i].xmax)
        ymax = int(predbox[i].ymax)
        classId = predbox[i].classId
        score = predbox[i].score

        cv2.rectangle(orig, (xmin, ymin), (xmax, ymax), (0, 255, 0), 2)
        ptext = (xmin, ymin)
        title = CLASSES[classId] + "%.2f" % score
        cv2.putText(orig, title, ptext, cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 0, 255), 2, cv2.LINE_AA)

    cv2.imwrite('./test_onnx_result.jpg', orig)


if __name__ == '__main__':
    print('This is main ....')
    GenerateMeshgrid()
    img_path = IMG_PATH
    detect(img_path)

4.2 The .pt2onnx code

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Get the parent directory of the directory containing this script and build relative paths
import os
import sys
current_dir = os.path.dirname(os.path.abspath(__file__))
project_path = os.path.join(current_dir, '..')
sys.path.append(project_path)
sys.path.append(current_dir)
#based: https://docs.ultralytics.com/modes/export/#key-features-of-export-mode
from ultralytics import YOLO

# Load a model
model = YOLO("/app/rk3588_build/last_moonpie_yolov11s.pt")  # load an official model
#model = YOLO("./best_moonpie.pt")  # load an official model
results = model(task='detect', source='../../cake26.jpg', save=True)  # predict on an image

4.1.1 Related change 1: modify the yolov11 ultralytics source, ./nn/head.py, replacing the body of Detect.forward

    def forward(self, x):
        # fengxh modified here. at Feb17,2025
        y = []
        for i in range(self.nl):
            t1 = self.cv2[i](x[i])
            t2 = self.cv3[i](x[i])
            y.append(t1)
            y.append(t2)
        return y

4.1.2 Related change 2: modify the ONNX export part of ./engine/model.py, which redefines the .onnx output tensors:

      print("===================onnx====================")import torchdummy_input = torch.randn(1,3,640,640)input_names=['data']output_names=['reg1', 'cls1','reg2', 'cls2','reg3', 'cls3']torch.onnx.export(self.model, dummy_input, '/app/rk3588_build/yolo11_selfgen.onnx', verbose=False, input_names=input_names, output_names=output_names, opset_version=11)print("==================onnx self gened==========")

4.3 Detection results

My dataset does not recognize the filling:

Appendix A: Additional tips

For the conversion to succeed, the PyTorch version must not exceed 2.4; I used 2.4.0, strictly aligned with what rknn-toolkit2 2.3 expects.
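A small sanity check that can be dropped into the export script; the exact supported range should be confirmed against the rknn-toolkit2 release notes, 2.4.x is simply what worked here:

import torch

# Fail fast if the environment has drifted away from the torch version that the
# rknn-toolkit2 2.3 conversion was validated against in this setup.
assert torch.__version__.startswith('2.4.'), f'unexpected torch version: {torch.__version__}'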
