Unlocking vGPU on a GTX 1060 with vgpu_unlock on Ubuntu 22.04 (by quqi99)

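Author: 张华  Published: 2025-10-21
Copyright notice: this article may be reproduced freely, provided the reproduction includes a hyperlink to the original source, the author information, and this copyright notice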
( http://blog.csdn.net/quqi99 )

This article uses vgpu_unlock on Ubuntu 22.04 to unlock a GTX 1060 so that it can be used for vGPU experiments.

1, Upgrade the system

sudo apt update
sudo apt upgrade -y
sudo apt install -y git build-essential dkms unzip python3-pip mdevctl jq
#disable the open-source nouveau driver for NVIDIA cards
echo "blacklist nouveau" |sudo tee -a /etc/modprobe.d/disable-nouveau.conf
echo "options nouveau modeset=0" |sudo tee -a /etc/modprobe.d/disable-nouveau.conf
sudo update-initramfs -u
sudo reboot now
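
After the reboot it is worth confirming that nouveau is really gone before installing anything. The sketch below is one way to do that; the function names `module_loaded` and `check_nouveau` are made up for this example, and the file argument exists only so the check can be tried against a sample file instead of the live /proc/modules.

```shell
# Return 0 if the named kernel module appears in a /proc/modules-style file.
# $1: module name; $2: file to read (defaults to /proc/modules).
module_loaded() (
    name=$1
    modules_file=${2:-/proc/modules}
    # The first whitespace-separated field of each line is the module name.
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$modules_file"
)

# Print a one-line verdict about nouveau.
# $1: optional modules file (hypothetical parameter, for testing).
check_nouveau() {
    if module_loaded nouveau "${1:-/proc/modules}"; then
        echo "nouveau is still loaded"
    else
        echo "nouveau is not loaded"
    fi
}
```

Running `check_nouveau` with no argument inspects the running kernel.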

2, Enable IOMMU

sudo find /sys/kernel/iommu_groups/ -type l | sort
hua@x99:~$ cat /etc/default/grub |grep iommu
GRUB_CMDLINE_LINUX="transparent_hugepage=never hugepagesz=2M hugepages=128 default_hugepagesz=2M intel_iommu=on iommu=pt"
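
A change to /etc/default/grub only takes effect after `sudo update-grub` and a reboot, so it helps to check the running kernel rather than the config file. The following is a minimal sketch under that assumption; `iommu_enabled_in_cmdline` and `list_iommu_groups` are hypothetical helper names, and their optional arguments exist so they can be tried on sample data.

```shell
# Return 0 if a kernel command line contains intel_iommu=on.
# $1: command line string (defaults to the contents of /proc/cmdline).
iommu_enabled_in_cmdline() (
    cmdline=${1-$(cat /proc/cmdline)}
    set -f  # no globbing while splitting the string into words
    for word in $cmdline; do
        case $word in
            intel_iommu=on) exit 0 ;;
        esac
    done
    exit 1
)

# Print "group device" pairs for every device under an iommu_groups tree.
# $1: tree root (defaults to /sys/kernel/iommu_groups).
list_iommu_groups() (
    root=${1:-/sys/kernel/iommu_groups}
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue
        group=${dev#"$root"/}
        group=${group%%/*}
        echo "$group ${dev##*/}"
    done
)
```

If `list_iommu_groups` prints nothing on the host, the IOMMU is most likely not active yet.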

3, Remove any NVIDIA drivers that are already installed.

4, Download the files:

#use the following files
NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm.run
Original driver: https://alist.homelabproject.cc/d/foxipan/vGPU/17.5/NVIDIA-GRID-Linux-KVM-550.144.02-550.144.03-553.62/Host_Drivers/NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm.run
Patch: https://alist.homelabproject.cc/d/foxipan/vGPU/17.5/NVIDIA_550.144.02_vGPU.patch
libvgpu_unlock_rs.so plugin: https://alist.homelabproject.cc/d/foxipan/vGPU/17.5/libvgpu_unlock_rs.so

5, Install

#scp /bak/tools/vgpu/* hua@192.168.99.220:/bak/work/vgpu/
sudo chmod a+x ./NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm.run
./NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm.run --apply-patch NVIDIA_550.144.02_vGPU.patch
ls NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm-custom.run
./NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm-custom.run  --extract-only --target NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm-custom
cd NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm-custom
sudo ./nvidia-installer -m kernel
sudo mkdir /etc/vgpu_unlock
sudo touch /etc/vgpu_unlock/profile_override.toml
sudo chown -R $USER /etc/vgpu_unlock
sudo mkdir /etc/systemd/system/{nvidia-vgpud.service.d,nvidia-vgpu-mgr.service.d}
echo -e "[Service]\nEnvironment=LD_PRELOAD=/bak/work/vgpu/libvgpu_unlock_rs.so" |sudo tee /etc/systemd/system/nvidia-vgpud.service.d/vgpu_unlock.conf
echo -e "[Service]\nEnvironment=LD_PRELOAD=/bak/work/vgpu/libvgpu_unlock_rs.so" |sudo tee /etc/systemd/system/nvidia-vgpu-mgr.service.d/vgpu_unlock.conf
sudo systemctl daemon-reload
sudo systemctl restart {nvidia-vgpud.service,nvidia-vgpu-mgr.service}
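
The two `echo -e ... | sudo tee` lines above differ only in the unit name, so they can be folded into one helper. This is a sketch, not part of the original procedure: `write_unlock_dropin` is a made-up name, and the third argument is there so the helper can be pointed at a scratch directory first.

```shell
# Write an LD_PRELOAD drop-in for one systemd unit.
# $1: unit name, e.g. nvidia-vgpud.service
# $2: absolute path of libvgpu_unlock_rs.so
# $3: systemd directory (defaults to /etc/systemd/system)
write_unlock_dropin() (
    unit=$1
    lib=$2
    systemd_dir=${3:-/etc/systemd/system}
    dropin_dir=$systemd_dir/$unit.d
    mkdir -p "$dropin_dir" || exit 1
    printf '[Service]\nEnvironment=LD_PRELOAD=%s\n' "$lib" > "$dropin_dir/vgpu_unlock.conf"
)
```

On the real host the helper needs a root shell because it writes under /etc; follow it with `systemctl daemon-reload` as in the steps above.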

Note: install with './nvidia-installer -m kernel'. Installing with "./nvidia-installer --dkms" instead fails with the error below.

#sudo ./nvidia-installer --dkms
#sudo ./NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm-custom.run
ERROR: Unable to load the kernel module 'nvidia.ko'.  This happens most frequently when this kernel module was built against the wrong or improperly configured kernel sources, with a version of gcc that differs from the one used to build the target kernel, or if another driver, such as nouveau, is present and prevents the NVIDIA kernel module from obtaining ownership of the NVIDIA device(s), or no NVIDIA device installed in this system is supported by this NVIDIA Linux graphics driver release.                          Please see the log entries 'Kernel module load error' and 'Kernel messages' at the end of the file '/var/log/nvidia-installer.log' for more information.

6, Prevent automatic kernel updates.

hua@x99:~$  sudo apt-mark hold linux-image-$(uname -r) linux-headers-$(uname -r)
linux-image-6.8.0-85-generic set on hold.
linux-headers-6.8.0-85-generic set on hold.

7, Verify

hua@x99:~$ nvidia-smi vgpu
Tue Oct 21 14:36:55 2025       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 550.144.02             Driver Version: 550.144.02                |
|---------------------------------+------------------------------+------------+
| GPU  Name                       | Bus-Id                       | GPU-Util   |
|      vGPU ID     Name           | VM ID     VM Name            | vGPU-Util  |
|=================================+==============================+============|
|   0  NVIDIA GeForce GTX 106...  | 00000000:03:00.0             |   0%       |
|      3251634210  NVIDIA RTXA... | 2dbd...  test                |      0%    |
+---------------------------------+------------------------------+------------+
hua@x99:~$ nvidia-smi
Tue Oct 21 14:37:05 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.144.02             Driver Version: 550.144.02     CUDA Version: N/A      |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce GTX 1060 6GB    On  |   00000000:03:00.0 Off |                  N/A |
|  0%   20C    P8             16W /  120W |    1043MiB /   6144MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A     57040    C+G   vgpu                                         1014MiB |
+-----------------------------------------------------------------------------------------+
$ mdevctl types |grep nvidia-760 -A4
nvidia-760
Available instances: 24
Device API: vfio-pci
Name: NVIDIA RTXA5500-1B
Description: num_heads=4, frl_config=45, framebuffer=1024M, max_resolution=5120x2880, max_instance=24
$ ls -l /sys/class/mdev_bus
total 0
lrwxrwxrwx 1 root root 0 10月 21 12:16 0000:03:00.0 -> ../../devices/pci0000:00/0000:00:02.0/0000:03:00.0
$ ls /sys/class/mdev_bus/0000\:03\:00.0/mdev_supported_types/
nvidia-760  nvidia-762  nvidia-764  nvidia-766  nvidia-768  nvidia-770  nvidia-772  nvidia-774  nvidia-776
nvidia-761  nvidia-763  nvidia-765  nvidia-767  nvidia-769  nvidia-771  nvidia-773  nvidia-775  nvidia-777
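
With eighteen types to choose from, a one-line-per-type summary is easier to read than grepping each ID. The sketch below condenses `mdevctl types` output; `summarize_mdev_types` is a hypothetical name, and the parsing assumes the field order shown above (type ID, then "Available instances:", then "Name:").

```shell
# Read `mdevctl types` output on stdin and print one line per type:
# type<TAB>name<TAB>available instances
summarize_mdev_types() {
    awk '
        NF == 1 && $1 ~ /^nvidia-[0-9]+$/ { type = $1 }
        $1 == "Available" && $2 == "instances:" { avail = $NF }
        $1 == "Name:" {
            # Drop the "Name:" label and keep the rest of the line.
            sub(/^[ \t]*Name:[ \t]*/, "")
            print type "\t" $0 "\t" avail
        }
    '
}
```

Typical use would be `mdevctl types | summarize_mdev_types`.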

8, [Optional] Create a vGPU manually

cd /sys/class/mdev_bus/*/mdev_supported_types/nvidia-760
hua@x99:/sys/class/mdev_bus/0000:03:00.0/mdev_supported_types/nvidia-760$ uuid=$(uuidgen)
hua@x99:/sys/class/mdev_bus/0000:03:00.0/mdev_supported_types/nvidia-760$ echo $uuid
f8e3bb98-d797-4e6d-8b72-d205d8d399d3
hua@x99:/sys/class/mdev_bus/0000:03:00.0/mdev_supported_types/nvidia-760$ ls
available_instances  create  description  device_api  devices  name
hua@x99:/sys/class/mdev_bus/0000:03:00.0/mdev_supported_types/nvidia-760$ sudo sh -c "echo $uuid > create"
hua@x99:/sys/class/mdev_bus/0000:03:00.0/mdev_supported_types/nvidia-760$ cd ../../$uuid
hua@x99:/sys/class/mdev_bus/0000:03:00.0/f8e3bb98-d797-4e6d-8b72-d205d8d399d3$ ls -l
total 0
lrwxrwxrwx 1 root root    0 10月 21 12:33 driver -> ../../../../../bus/mdev/drivers/nvidia-vgpu-vfio
lrwxrwxrwx 1 root root    0 10月 21 12:34 iommu_group -> ../../../../../kernel/iommu_groups/47
lrwxrwxrwx 1 root root    0 10月 21 12:34 mdev_type -> ../mdev_supported_types/nvidia-760
drwxr-xr-x 2 root root    0 10月 21 12:34 nvidia
drwxr-xr-x 2 root root    0 10月 21 12:34 power
--w------- 1 root root 4096 10月 21 12:34 remove
lrwxrwxrwx 1 root root    0 10月 21 12:33 subsystem -> ../../../../../bus/mdev
-rw-r--r-- 1 root root 4096 10月 21 12:33 uevent
drwxr-xr-x 3 root root    0 10月 21 12:33 vfio-dev
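
The listing above also shows a write-only `remove` node, which is the counterpart of `create`: writing 1 to it destroys the mediated device. A minimal sketch of both operations follows; `create_mdev` and `remove_mdev` are made-up names, and the directories are passed in as arguments so the functions can be tried on a scratch directory before touching sysfs.

```shell
# Create a mediated device by writing a UUID into a type's "create" node.
# $1: mdev type directory, e.g. .../mdev_supported_types/nvidia-760
# $2: UUID for the new device
create_mdev() {
    printf '%s\n' "$2" > "$1/create"
}

# Remove a mediated device by writing 1 into its "remove" node.
# $1: device directory, e.g. /sys/bus/mdev/devices/<uuid>
remove_mdev() {
    printf '1\n' > "$1/remove"
}
```

Both sysfs nodes are writable by root only, so on the host these functions have to run in a root shell.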

Another way to create a vGPU is with mdevctl:

uuid=$(uuidgen)
sudo mdevctl start -u $uuid -p 0000:03:00.0 --type nvidia-762
$ mdevctl list |grep 762
660da331-0255-4ad0-8e81-e329df393ccc 0000:03:00.0 nvidia-762
sudo mdevctl define --auto --uuid $uuid
$ mdevctl list |grep 762
660da331-0255-4ad0-8e81-e329df393ccc 0000:03:00.0 nvidia-762 (defined)

9, Usage with different tools.

#qemu
args: -device 'vfio-pci,sysfsdev=/sys/bus/mdev/devices/$uuid' -uuid <VMID>

#libvirt
<hostdev mode='subsystem' type='mdev' model='vfio-pci'>
  <source>
    <address uuid='f8e3bb98-d797-4e6d-8b72-d205d8d399d3'/>
  </source>
</hostdev>

#openstack
openstack flavor set vgpu_1 --property "resources:VGPU=1"
openstack server create --flavor vgpu_1 --image cirros-0.3.5-x86_64-uec --wait test-vgpu
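
For the libvirt case, the hostdev element only varies in the UUID, so it can be generated instead of pasted by hand. The function below is a sketch; `mdev_hostdev_xml` is a hypothetical name and the element layout simply mirrors the snippet in this section.

```shell
# Print a libvirt <hostdev> element for a mediated device.
# $1: UUID of the mdev device
mdev_hostdev_xml() {
    cat <<EOF
<hostdev mode='subsystem' type='mdev' model='vfio-pci'>
  <source>
    <address uuid='$1'/>
  </source>
</hostdev>
EOF
}
```

One possible use is `mdev_hostdev_xml "$uuid" > hostdev.xml`, then either paste the file into `virsh edit` or try `virsh attach-device` with `--config` on a stopped domain.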

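10, Example. Note: the user-data part of the example below has a problem, but it does not matter: the default password of the ubuntu user is ubuntu, so simply log in with the default password.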

#make an iso image
sudo apt install -y virtinst cloud-image-utils qemu-utils qemu-kvm libvirt-daemon-system libvirt-clients ovmf whois
mkdir ~/test && cd ~/test
wget https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
qemu-img create -f qcow2 -F qcow2 -b noble-server-cloudimg-amd64.img test.qcow2 20G
qemu-img info test.qcow2
mkpasswd --method=SHA-512 --rounds=4096 --salt=salt12345 password
#in the author's test the password set below did not take effect; the default password is ubuntu
#the EOF delimiter is quoted so that the shell does not expand the $6$... fields of the hash
cat > user-data <<'EOF'
#cloud-config
users:
  - name: ubuntu
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash
    lock_passwd: false
    passwd: "$6$rounds=4096$salt12345$iwi3zE6QTLVkp.v8TkFOBT4DTrzgtNF5mxYDAq8Nl0/4ZphQSLZ8mlpoAFR8ZlbKXXW02DFxsw1XxvFU9.Sre/"
chpasswd:
  expire: False
package_upgrade: true
packages:
  - pciutils
  - wget
  - build-essential
  - linux-headers-generic
runcmd:
  - echo "VM ready for NVIDIA mdev test" > /home/ubuntu/READY
EOF
touch meta-data
cloud-localds seed.iso user-data meta-data

#create a test vm
#sudo virsh destroy test && sudo virsh undefine test --nvram
sudo chmod o+x /home/hua
sudo chmod o+x /home/hua/test
sudo chmod o+rw /home/hua/test/test.qcow2
sudo chmod o+r /home/hua/test/seed.iso
sudo virt-install --name test --ram 4096 --vcpus 2 \
  --disk path=./test.qcow2,device=disk,bus=virtio \
  --disk path=./seed.iso,device=cdrom \
  --os-variant ubuntu24.04 \
  --graphics none \
  --console pty,target_type=serial \
  --import \
  --network network=default \
  --boot loader=/usr/share/OVMF/OVMF_CODE.fd,loader.readonly=yes,loader.type=pflash,nvram.template=/usr/share/OVMF/OVMF_VARS.fd
$ mdevctl list
660da331-0255-4ad0-8e81-e329df393ccc 0000:03:00.0 nvidia-762 (defined)
f8e3bb98-d797-4e6d-8b72-d205d8d399d3 0000:03:00.0 nvidia-760
#the default user/password is ubuntu/ubuntu
#sudo virsh console test 
sudo dhcpcd enp1s0
sudo virsh destroy test
sudo virsh edit test
<devices>
...
<hostdev mode='subsystem' type='mdev' model='vfio-pci'>
  <source>
    <address uuid='f8e3bb98-d797-4e6d-8b72-d205d8d399d3'/>
  </source>
</hostdev>
...
</devices>
sudo virsh start test
ubuntu@ubuntu:~$ lspci -nn | grep -i nvidia
07:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA102GL [RTX A5500] [10de:2233] (rev a1)
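
The note at the top of this section says the user-data does not behave as intended. One plausible cause, which is an assumption and not something the author confirmed, is heredoc expansion: with an unquoted `EOF` delimiter the shell treats `$6`, `$rounds` and similar fragments of a crypt-style hash as variables and replaces them before the file is written. The sketch below shows the difference using a shortened, made-up hash; both function names are hypothetical.

```shell
# Unquoted delimiter: $6, $rounds, $salt and $hash are expanded by the shell.
unquoted_heredoc() {
    cat <<EOF
passwd: "$6$rounds=4096$salt$hash"
EOF
}

# Quoted delimiter: the body is written out literally.
quoted_heredoc() {
    cat <<'EOF'
passwd: "$6$rounds=4096$salt$hash"
EOF
}
```

In a shell where none of those variables are set, `unquoted_heredoc` prints `passwd: "=4096"`, while `quoted_heredoc` keeps the full `$6$rounds=4096$salt$hash` string.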

Reference

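1, Notes on unlocking vGPU on NVIDIA cards under Ubuntu: removing the frame-rate limit, undervolting, overclocking, etc. - https://blog.csdn.net/cheng1999_cn1/article/details/134731075
2, How to patch NVIDIA's vgpu-kvm driver to force-enable vGPU - https://foxi.buduanwang.vip/virtualization/pve/3417.html/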
