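Unlocking vGPU on a GTX 1060 with vgpu_unlock on Ubuntu 22.04 (by quqi99)
Author: Zhang Hua (张华)  Published: 2025-10-21
Copyright notice: this article may be freely reproduced, provided the copy links to the original source and includes the author information and this copyright notice.
( http://blog.csdn.net/quqi99 )
This article is an experiment in using vgpu_unlock on Ubuntu 22.04 to unlock a GTX 1060 so that it supports vGPU.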
1, Upgrade the system
sudo apt update && sudo apt upgrade -y
sudo apt install -y git build-essential dkms unzip python3-pip mdevctl jq
#disable the open-source nouveau driver for NVIDIA cards
echo "blacklist nouveau" |sudo tee -a /etc/modprobe.d/disable-nouveau.conf
echo "options nouveau modeset=0" |sudo tee -a /etc/modprobe.d/disable-nouveau.conf
sudo update-initramfs -u
sudo reboot now
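2, Enable IOMMU
#add intel_iommu=on iommu=pt to GRUB_CMDLINE_LINUX in /etc/default/grub, then run 'sudo update-grub' and reboot
hua@x99:~$ cat /etc/default/grub |grep iommu
GRUB_CMDLINE_LINUX="transparent_hugepage=never hugepagesz=2M hugepages=128 default_hugepagesz=2M intel_iommu=on iommu=pt"
#after the reboot, the IOMMU groups should be listed here
sudo find /sys/kernel/iommu_groups/ -type l | sort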
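Before going further it can help to confirm that the flag really reached the running kernel. The following is a minimal sketch, not part of the original procedure; has_iommu_flag is a hypothetical helper name, and it only looks for the intel_iommu=on token used in this article.

```shell
#!/usr/bin/env bash
# Sketch: report whether a kernel command line enables the Intel IOMMU.
# has_iommu_flag is a hypothetical helper, not part of any existing tool.
has_iommu_flag() {
    # Pad with spaces so the flag is matched only as a whole token.
    case " $1 " in
        *" intel_iommu=on "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Inspect the running kernel; an unreadable /proc/cmdline is treated as empty.
current="$(cat /proc/cmdline 2>/dev/null || true)"
if has_iommu_flag "$current"; then
    echo "intel_iommu=on is present on the kernel command line"
else
    echo "intel_iommu=on is missing: edit GRUB_CMDLINE_LINUX, run update-grub, reboot"
fi
```

The check reads /proc/cmdline rather than /etc/default/grub because the grub file only shows what will be used after the next update-grub and reboot.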
3, Remove any existing NVIDIA drivers.
4, Download the files:
#use the following file
NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm.run
Original driver: https://alist.homelabproject.cc/d/foxipan/vGPU/17.5/NVIDIA-GRID-Linux-KVM-550.144.02-550.144.03-553.62/Host_Drivers/NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm.run
Patch: https://alist.homelabproject.cc/d/foxipan/vGPU/17.5/NVIDIA_550.144.02_vGPU.patch
libvgpu_unlock_rs.so plugin: https://alist.homelabproject.cc/d/foxipan/vGPU/17.5/libvgpu_unlock_rs.so
5, Install
#scp /bak/tools/vgpu/* hua@192.168.99.220:/bak/work/vgpu/
sudo chmod a+x ./NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm.run
./NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm.run --apply-patch NVIDIA_550.144.02_vGPU.patch
ls NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm-custom.run
./NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm-custom.run --extract-only --target NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm-custom
cd NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm-custom
sudo ./nvidia-installer -m kernel
sudo mkdir /etc/vgpu_unlock
sudo touch /etc/vgpu_unlock/profile_override.toml
sudo chown -R $USER /etc/vgpu_unlock
sudo mkdir /etc/systemd/system/{nvidia-vgpud.service.d,nvidia-vgpu-mgr.service.d}
echo -e "[Service]\nEnvironment=LD_PRELOAD=/bak/work/vgpu/libvgpu_unlock_rs.so" |sudo tee /etc/systemd/system/nvidia-vgpud.service.d/vgpu_unlock.conf
echo -e "[Service]\nEnvironment=LD_PRELOAD=/bak/work/vgpu/libvgpu_unlock_rs.so" |sudo tee /etc/systemd/system/nvidia-vgpu-mgr.service.d/vgpu_unlock.conf
sudo systemctl daemon-reload
sudo systemctl restart {nvidia-vgpud.service,nvidia-vgpu-mgr.service}
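The two drop-in files differ only in the unit name, so they can be written in a loop. The sketch below is an illustration under assumptions: write_unlock_dropins is a hypothetical helper, and it takes a root directory argument so it can be tried in a scratch directory before touching /etc.

```shell
#!/usr/bin/env bash
# Sketch: write the LD_PRELOAD drop-ins for both vGPU services.
# write_unlock_dropins ROOT LIB is a hypothetical helper; ROOT is a prefix
# such as "" (the real system, needs root) or a scratch directory.
write_unlock_dropins() {
    root="$1"
    lib="$2"
    for unit in nvidia-vgpud nvidia-vgpu-mgr; do
        dir="$root/etc/systemd/system/$unit.service.d"
        mkdir -p "$dir"
        printf '[Service]\nEnvironment=LD_PRELOAD=%s\n' "$lib" > "$dir/vgpu_unlock.conf"
    done
}

# Dry run into a scratch directory so nothing on the host is modified.
scratch="$(mktemp -d)"
write_unlock_dropins "$scratch" /bak/work/vgpu/libvgpu_unlock_rs.so
cat "$scratch/etc/systemd/system/nvidia-vgpud.service.d/vgpu_unlock.conf"
```

Using mkdir -p and overwriting the file makes the step safe to repeat. After writing to the real /etc, systemctl daemon-reload and a restart of both services are still needed.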
Note: install with './nvidia-installer -m kernel'. Running "./nvidia-installer --dkms" instead fails with the error below.
#sudo ./nvidia-installer --dkms
#sudo ./NVIDIA-Linux-x86_64-550.144.02-vgpu-kvm-custom.run
ERROR: Unable to load the kernel module 'nvidia.ko'. This happens most frequently when this kernel module was built against the wrong or improperly configured kernel sources, with a version of gcc that differs from the one used to build the target kernel, or if another driver, such as nouveau, is present and prevents the NVIDIA kernel module from obtaining ownership of the NVIDIA device(s), or no NVIDIA device installed in this system is supported by this NVIDIA Linux graphics driver release. Please see the log entries 'Kernel module load error' and 'Kernel messages' at the end of the file '/var/log/nvidia-installer.log' for more information.
6, Hold the kernel packages so the kernel is not upgraded automatically.
hua@x99:~$ sudo apt-mark hold linux-image-$(uname -r) linux-headers-$(uname -r)
linux-image-6.8.0-85-generic set on hold.
linux-headers-6.8.0-85-generic set on hold.
7, Verify
hua@x99:~$ nvidia-smi vgpu
Tue Oct 21 14:36:55 2025
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 550.144.02 Driver Version: 550.144.02 |
|---------------------------------+------------------------------+------------+
| GPU Name | Bus-Id | GPU-Util |
| vGPU ID Name | VM ID VM Name | vGPU-Util |
|=================================+==============================+============|
| 0 NVIDIA GeForce GTX 106... | 00000000:03:00.0 | 0% |
| 3251634210 NVIDIA RTXA... | 2dbd... test | 0% |
+---------------------------------+------------------------------+------------+
hua@x99:~$ nvidia-smi
Tue Oct 21 14:37:05 2025
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.144.02 Driver Version: 550.144.02 CUDA Version: N/A |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce GTX 1060 6GB On | 00000000:03:00.0 Off | N/A |
| 0% 20C P8 16W / 120W | 1043MiB / 6144MiB | 0% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 57040 C+G vgpu 1014MiB |
+-----------------------------------------------------------------------------------------+
$ mdevctl types |grep nvidia-760 -A4
  nvidia-760
    Available instances: 24
    Device API: vfio-pci
    Name: NVIDIA RTXA5500-1B
    Description: num_heads=4, frl_config=45, framebuffer=1024M, max_resolution=5120x2880, max_instance=24
$ ls -l /sys/class/mdev_bus
total 0
lrwxrwxrwx 1 root root 0 10月 21 12:16 0000:03:00.0 -> ../../devices/pci0000:00/0000:00:02.0/0000:03:00.0
$ ls /sys/class/mdev_bus/0000\:03\:00.0/mdev_supported_types/
nvidia-760 nvidia-762 nvidia-764 nvidia-766 nvidia-768 nvidia-770 nvidia-772 nvidia-774 nvidia-776
nvidia-761 nvidia-763 nvidia-765 nvidia-767 nvidia-769 nvidia-771 nvidia-773 nvidia-775 nvidia-777
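The type IDs alone say little, so it is handy to print each type with its name and remaining instances. This is a small sketch that reads the same sysfs files shown in this article (name, available_instances); list_mdev_types is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: print "type<TAB>name<TAB>available_instances" for every mdev type
# directory under BASE. list_mdev_types is a hypothetical helper.
list_mdev_types() {
    base="$1"
    for d in "$base"/*/; do
        # Skip anything that does not look like an mdev type directory.
        [ -f "$d/name" ] || continue
        printf '%s\t%s\t%s\n' \
            "$(basename "$d")" \
            "$(cat "$d/name")" \
            "$(cat "$d/available_instances")"
    done
}

# Walk every mdev-capable device; prints nothing if there is none.
for dev in /sys/class/mdev_bus/*; do
    [ -d "$dev/mdev_supported_types" ] || continue
    list_mdev_types "$dev/mdev_supported_types"
done
```

On the host in this article, the nvidia-760 row would carry the name NVIDIA RTXA5500-1B, matching the mdevctl types output.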
8, [Optional] Create a vGPU manually
cd /sys/class/mdev_bus/*/mdev_supported_types/nvidia-760
hua@x99:/sys/class/mdev_bus/0000:03:00.0/mdev_supported_types/nvidia-760$ uuid=$(uuidgen)
hua@x99:/sys/class/mdev_bus/0000:03:00.0/mdev_supported_types/nvidia-760$ echo $uuid
f8e3bb98-d797-4e6d-8b72-d205d8d399d3
hua@x99:/sys/class/mdev_bus/0000:03:00.0/mdev_supported_types/nvidia-760$ ls
available_instances create description device_api devices name
hua@x99:/sys/class/mdev_bus/0000:03:00.0/mdev_supported_types/nvidia-760$ sudo sh -c "echo $uuid > create"
hua@x99:/sys/class/mdev_bus/0000:03:00.0/mdev_supported_types/nvidia-760$ cd ../../$uuid
hua@x99:/sys/class/mdev_bus/0000:03:00.0/f8e3bb98-d797-4e6d-8b72-d205d8d399d3$ ls -l
total 0
lrwxrwxrwx 1 root root 0 10月 21 12:33 driver -> ../../../../../bus/mdev/drivers/nvidia-vgpu-vfio
lrwxrwxrwx 1 root root 0 10月 21 12:34 iommu_group -> ../../../../../kernel/iommu_groups/47
lrwxrwxrwx 1 root root 0 10月 21 12:34 mdev_type -> ../mdev_supported_types/nvidia-760
drwxr-xr-x 2 root root 0 10月 21 12:34 nvidia
drwxr-xr-x 2 root root 0 10月 21 12:34 power
--w------- 1 root root 4096 10月 21 12:34 remove
lrwxrwxrwx 1 root root 0 10月 21 12:33 subsystem -> ../../../../../bus/mdev
-rw-r--r-- 1 root root 4096 10月 21 12:33 uevent
drwxr-xr-x 3 root root 0 10月 21 12:33 vfio-dev
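The manual steps above can be wrapped with a basic sanity check so that a malformed UUID never reaches the create file. This is only a sketch under assumptions: is_uuid and request_mdev are hypothetical helpers, and on a real host the write needs root.

```shell
#!/usr/bin/env bash
# Sketch: validate a UUID, then write it to TYPE_DIR/create.
# is_uuid and request_mdev are hypothetical helpers.
is_uuid() {
    printf '%s\n' "$1" |
        grep -Eqi '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
}

request_mdev() {
    type_dir="$1"
    uuid="$2"
    is_uuid "$uuid" || { echo "not a UUID: $uuid" >&2; return 1; }
    [ -d "$type_dir" ] || { echo "no such type directory: $type_dir" >&2; return 1; }
    # On a real host this is the sysfs 'create' file and requires root.
    printf '%s\n' "$uuid" > "$type_dir/create"
}

# Harmless demonstration: only the format check runs here.
if is_uuid "f8e3bb98-d797-4e6d-8b72-d205d8d399d3"; then
    echo "UUID format looks valid"
fi
```

On the host from this article, the functions would be run in a root shell with /sys/class/mdev_bus/0000:03:00.0/mdev_supported_types/nvidia-760 as the type directory.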
Another way to create a vGPU is with mdevctl:
uuid=$(uuidgen)
sudo mdevctl start -u $uuid -p 0000:03:00.0 --type nvidia-762
$ mdevctl list |grep 762
660da331-0255-4ad0-8e81-e329df393ccc 0000:03:00.0 nvidia-762
sudo mdevctl define --auto --uuid $uuid
$ mdevctl list |grep 762
660da331-0255-4ad0-8e81-e329df393ccc 0000:03:00.0 nvidia-762 (defined)
9, Usage examples
#qemu
args: -device 'vfio-pci,sysfsdev=/sys/bus/mdev/devices/$uuid' -uuid <VMID>
#libvirt
<hostdev mode='subsystem' type='mdev' model='vfio-pci'>
  <source>
    <address uuid='f8e3bb98-d797-4e6d-8b72-d205d8d399d3'/>
  </source>
</hostdev>
#openstack
openstack flavor set vgpu_1 --property "resources:VGPU=1"
openstack server create --flavor vgpu_1 --image cirros-0.3.5-x86_64-uec --wait test-vgpu
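For the libvirt case, the hostdev element can be generated from the mdev UUID instead of being typed by hand. A minimal sketch, assuming a hypothetical helper hostdev_xml and a hypothetical output file name vgpu-hostdev.xml:

```shell
#!/usr/bin/env bash
# Sketch: print a libvirt <hostdev> element for a given mdev UUID.
# hostdev_xml is a hypothetical helper; the XML mirrors the snippet in this article.
hostdev_xml() {
    cat <<EOF
<hostdev mode='subsystem' type='mdev' model='vfio-pci'>
  <source>
    <address uuid='$1'/>
  </source>
</hostdev>
EOF
}

# Write the element to a file in a scratch directory and show it.
out_dir="$(mktemp -d)"
hostdev_xml f8e3bb98-d797-4e6d-8b72-d205d8d399d3 > "$out_dir/vgpu-hostdev.xml"
cat "$out_dir/vgpu-hostdev.xml"
```

The resulting file could then be pasted in through virsh edit, or attached to a defined domain with virsh attach-device <domain> vgpu-hostdev.xml --config.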
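10, A worked example. Note: the author found that the user-data part of this example does not work as intended. That is harmless, since the default password of the ubuntu user is ubuntu and you can log in with it.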
#make an iso image
sudo apt install -y virtinst cloud-image-utils qemu-utils qemu-kvm libvirt-daemon-system libvirt-clients ovmf whois
mkdir ~/test && cd ~/test
wget https://cloud-images.ubuntu.com/noble/current/noble-server-cloudimg-amd64.img
qemu-img create -f qcow2 -F qcow2 -b noble-server-cloudimg-amd64.img test.qcow2 20G
qemu-img info test.qcow2
mkpasswd --method=SHA-512 --rounds=4096 --salt=salt12345 password
#the author reports that this password does not take effect and logs in with ubuntu/ubuntu instead.
#one likely cause: with an unquoted heredoc delimiter the shell expands $6, $rounds, ... inside the hash, so the delimiter is quoted here
cat > user-data <<'EOF'
#cloud-config
users:
  - name: ubuntu
    sudo: ALL=(ALL) NOPASSWD:ALL
    shell: /bin/bash
    lock_passwd: false
    passwd: "$6$rounds=4096$salt12345$iwi3zE6QTLVkp.v8TkFOBT4DTrzgtNF5mxYDAq8Nl0/4ZphQSLZ8mlpoAFR8ZlbKXXW02DFxsw1XxvFU9.Sre/"
chpasswd:
  expire: False
package_upgrade: true
packages:
  - pciutils
  - wget
  - build-essential
  - linux-headers-generic
runcmd:
  - echo "VM ready for NVIDIA mdev test" > /home/ubuntu/READY
EOF
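A SHA-512 crypt hash is full of '$' characters, and how the shell treats them depends on whether the heredoc delimiter is quoted. The sketch below illustrates the difference; the hash is a made-up placeholder, and render_unquoted and render_quoted are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: compare how a heredoc treats '$' with an unquoted and a quoted delimiter.
# The hash below is a made-up placeholder, not a real password hash.
render_unquoted() {
cat <<EOF
passwd: "$6$rounds=4096$salt12345$placeholderhash"
EOF
}

render_quoted() {
cat <<'EOF'
passwd: "$6$rounds=4096$salt12345$placeholderhash"
EOF
}

# With <<EOF the shell expands $6, $rounds, $salt12345 and $placeholderhash
# (all unset here), so most of the hash disappears. With <<'EOF' it is kept.
echo "unquoted: $(render_unquoted)"
echo "quoted:   $(render_quoted)"
```

If the generated user-data file shows a truncated passwd value, this expansion is a plausible explanation, though other cloud-init settings could also be involved.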
touch meta-data
cloud-localds seed.iso user-data meta-data
#create a test vm
#sudo virsh destroy test && sudo virsh undefine test --nvram
sudo chmod o+x /home/hua
sudo chmod o+x /home/hua/test
sudo chmod o+rw /home/hua/test/test.qcow2
sudo chmod o+r /home/hua/test/seed.iso
sudo virt-install --name test --ram 4096 --vcpus 2 \
  --disk path=./test.qcow2,device=disk,bus=virtio \
  --disk path=./seed.iso,device=cdrom \
  --os-variant ubuntu24.04 \
  --graphics none \
  --console pty,target_type=serial \
  --import \
  --network network=default \
  --boot loader=/usr/share/OVMF/OVMF_CODE.fd,loader.readonly=yes,loader.type=pflash,nvram.template=/usr/share/OVMF/OVMF_VARS.fd
$ mdevctl list
660da331-0255-4ad0-8e81-e329df393ccc 0000:03:00.0 nvidia-762 (defined)
f8e3bb98-d797-4e6d-8b72-d205d8d399d3 0000:03:00.0 nvidia-760
#the default user/password is ubuntu/ubuntu
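#sudo virsh console test
#inside the guest: request an address if the interface has none
sudo dhcpcd enp1s0
#back on the host: stop the VM, then add the vGPU to its definition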
sudo virsh destroy test
sudo virsh edit test
<devices>
...
<hostdev mode='subsystem' type='mdev' model='vfio-pci'>
  <source>
    <address uuid='f8e3bb98-d797-4e6d-8b72-d205d8d399d3'/>
  </source>
</hostdev>
...
</devices>
sudo virsh start test
ubuntu@ubuntu:~$ lspci -nn | grep -i nvidia
07:00.0 VGA compatible controller [0300]: NVIDIA Corporation GA102GL [RTX A5500] [10de:2233] (rev a1)
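References
1, Notes on unlocking vGPU on NVIDIA cards under Ubuntu: removing the frame-rate limit, undervolting, overclocking, etc. - https://blog.csdn.net/cheng1999_cn1/article/details/134731075
2, How to patch NVIDIA's vgpu-kvm driver to force-enable vGPU - https://foxi.buduanwang.vip/virtualization/pve/3417.html/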