
RuntimeError: CUDA error: invalid device function

news | Source: original | 2025/6/11 18:37:50

Cause: the CUDA kernels were compiled for a GPU architecture that is not compatible with the current GPU.

-- The CUDA compiler identification is NVIDIA 11.5.119 (the toolkit actually installed is 12.6, so CMake is picking up an old nvcc)

Solution:

1. Check the GPU's compute capability
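One way to carry out this step, sketched in Python because the project is tested through PyTorch: `torch.cuda.get_device_capability()` returns a `(major, minor)` tuple, and the helper `to_cmake_arch` (a hypothetical name, not part of the project) turns it into the value that `CUDA_ARCHITECTURES` expects. The sketch assumes a CUDA-enabled PyTorch is installed and returns `None` otherwise.

```python
def to_cmake_arch(major: int, minor: int) -> str:
    """Convert a compute capability such as (8, 9) to the CMake value "89"."""
    return f"{major}{minor}"


def detect_cmake_arch(device_index: int = 0):
    """Return the CMake architecture of a GPU, or None if it cannot be queried."""
    try:
        import torch  # assumption: PyTorch is installed
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    major, minor = torch.cuda.get_device_capability(device_index)
    return to_cmake_arch(major, minor)


if __name__ == "__main__":
    arch = detect_cmake_arch()
    print(arch if arch is not None else "no CUDA device visible to PyTorch")
```

The printed value is what goes into `CUDA_ARCHITECTURES` in the steps below; this post uses 89.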

2. Modify CMakeLists.txt

    set_target_properties(my_library PROPERTIES
        CUDA_ARCHITECTURES 89  # key fix: build for sm_89
    )

3. Modify the Makefile: add -DCMAKE_CUDA_ARCHITECTURES=89 to explicitly target sm_89

build:
	mkdir -p build
	cmake -Bbuild -DCMAKE_BUILD_TYPE=$(TYPE) -DUSE_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=89
	make -j -C build
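
After rebuilding, it can be worth confirming that the binary really contains sm_89 code. A minimal sketch, assuming `cuobjdump` from the CUDA toolkit is on `PATH` and that its `--list-elf` listing names each embedded cubin with an `sm_XX` suffix. The function names are illustrative, and the library path in the usage note is a placeholder.

```python
import re
import subprocess


def parse_sm_archs(listing: str) -> list:
    """Extract the distinct sm_XX numbers from a cuobjdump listing, sorted."""
    return sorted({int(m) for m in re.findall(r"sm_(\d+)", listing)})


def embedded_archs(binary_path: str) -> list:
    """Run `cuobjdump --list-elf` on a binary and return its embedded architectures."""
    result = subprocess.run(
        ["cuobjdump", "--list-elf", binary_path],
        capture_output=True, text=True, check=True,
    )
    return parse_sm_archs(result.stdout)
```

Calling `embedded_archs("build/libmy_library.so")` (placeholder path) should return a list containing 89 if the architecture setting took effect.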

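4. Remove the old nvcc. The old version lives in `/usr/bin`, while the new one may be in `/usr/local/cuda-12.6/bin`. CUDA 11.5 predates sm_89 support (added in CUDA 11.8), so it cannot build for this GPU.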

Commands:

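# Remove the old CUDA 11.5 nvcc binary (use with caution!)
sudo rm /usr/bin/nvcc
# Or, more safely, uninstall the old CUDA package through apt (if it was installed through apt)
# If nvcc came from Ubuntu's own package, the name may be nvidia-cuda-toolkit; `dpkg -S /usr/bin/nvcc` shows the owner
sudo apt purge cuda-toolkit-11-5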
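
Before deleting anything, it may help to see which `nvcc` is found first on `PATH` and which release it reports. A minimal sketch using only the standard library; the function names are illustrative, and the parsing assumes the usual `release X.Y` wording in the output of `nvcc --version`.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(version_text: str):
    """Return the (major, minor) release from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+)\.(\d+)", version_text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def nvcc_on_path():
    """Return (path, release) for the first nvcc on PATH, or None if absent."""
    path = shutil.which("nvcc")
    if path is None:
        return None
    out = subprocess.run([path, "--version"], capture_output=True, text=True).stdout
    return path, parse_nvcc_release(out)


if __name__ == "__main__":
    print(nvcc_on_path())
```

If the old binary has to stay installed, a commonly used alternative is to point CMake at the new compiler with `-DCMAKE_CUDA_COMPILER=/usr/local/cuda-12.6/bin/nvcc` and configure into a fresh build directory.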

Result:

-- The CUDA compiler identification is NVIDIA 12.6.85

Problem solved; the test ran successfully.

ubun22:/mnt/c/Users/lms/Desktop/cuda$  /usr/bin/env /usr/bin/python /home/ubun22/.vscode-server/extensions/ms-python.debugpy-2025.6.0-linux-x64/bundled/libs/debugpy/adapter/../../debugpy/launcher 48006 -- /mnt/c/Users/lms/Desktop/cuda/hpc2torch1/test/gather.py --device cuda 
Testing Gather on cuda with x_shape:(3, 2) , indice_shape:(2, 2), axis:0 ,dtype:torch.float32
2025-04-09 20:36:15,539 Pytorch: 0.04505600035190582 ms, kernel: 0.03885599970817566 ms [-13.76%]
absolute error:0.0000e+00
relative error:0.0000e+00
Testing Gather on cuda with x_shape:(3, 2) , indice_shape:(1, 2), axis:1 ,dtype:torch.float32
2025-04-09 20:36:15,547 Pytorch: 0.04580639898777008 ms, kernel: 0.042956799268722534 ms [-6.22%]
absolute error:0.0000e+00
relative error:0.0000e+00
Testing Gather on cuda with x_shape:(3, 2, 3) , indice_shape:(1, 2, 1), axis:1 ,dtype:torch.float32
2025-04-09 20:36:15,552 Pytorch: 0.029182401299476624 ms, kernel: 0.021587200462818146 ms [-26.03%]
absolute error:0.0000e+00
relative error:0.0000e+00
Testing Gather on cuda with x_shape:(50257, 768) , indice_shape:(16, 1024), axis:0 ,dtype:torch.float32
2025-04-09 20:36:15,596 Pytorch: 0.40238080024719236 ms, kernel: 0.4157440185546875 ms [+3.32%]
absolute error:0.0000e+00
relative error:0.0000e+00
Testing Gather on cuda with x_shape:(1024, 512, 32, 4) , indice_shape:(128, 2, 2), axis:1 ,dtype:torch.float32
2025-04-09 20:36:16,877 Pytorch: 1.8170879364013672 ms, kernel: 1.8123775482177735 ms [-0.26%]
absolute error:0.0000e+00
relative error:0.0000e+00
Testing Gather on cuda with x_shape:(3, 2) , indice_shape:(2, 2), axis:0 ,dtype:torch.float16
2025-04-09 20:36:22,590 Pytorch: 0.014347200095653535 ms, kernel: 0.01908479928970337 ms [+33.02%]
absolute error:0.0000e+00
relative error:0.0000e+00
Testing Gather on cuda with x_shape:(3, 2) , indice_shape:(1, 2), axis:1 ,dtype:torch.float16
2025-04-09 20:36:22,593 Pytorch: 0.02247679978609085 ms, kernel: 0.00979039967060089 ms [-56.44%]
absolute error:0.0000e+00
relative error:0.0000e+00
Testing Gather on cuda with x_shape:(50257, 768) , indice_shape:(16, 1024), axis:0 ,dtype:torch.float16
2025-04-09 20:36:22,613 Pytorch: 0.2537472009658813 ms, kernel: 0.20623359680175782 ms [-18.72%]
absolute error:0.0000e+00
relative error:0.0000e+00
Testing Gather on cuda with x_shape:(512, 128, 4, 128) , indice_shape:(4, 1, 1), axis:2 ,dtype:torch.float16
2025-04-09 20:36:23,894 Pytorch: 0.6146048069000244 ms, kernel: 0.6029823780059814 ms [-1.89%]
absolute error:0.0000e+00
relative error:0.0000e+00
ubun22:/mnt/c/Users/lms/Desktop/cuda$
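
The bracketed percentages in the log are consistent with the relative change of the custom kernel's time against the PyTorch time, so negative values mean the kernel was faster. The test script's own formula is not shown here, so the following is only a sketch that reproduces the logged figures; `relative_change_percent` is a hypothetical name.

```python
def relative_change_percent(baseline_ms: float, kernel_ms: float) -> float:
    """Percentage change of the kernel time relative to the baseline time."""
    return (kernel_ms - baseline_ms) / baseline_ms * 100.0


# First float32 case from the log above: prints -13.76
print(f"{relative_change_percent(0.04505600035190582, 0.03885599970817566):.2f}")
```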