19 Commits

Author · SHA1 · Message · Date

- Kenny.ch · 640a834483 · Merge branch 'master' into 'master': Fixed Guide for Security, Stability, and Sanity (see merge request polloloco/vgpu-proxmox!6) · 2024-02-15 17:24:35 +00:00
- Kenny.ch · 8e85d32eb0 · Fixed Guide for security / stability and sanity · 2024-02-15 17:22:10 +00:00
- PolloLoco · 3110f37d80 · 14.x and 15.x driver versions are EOL. Add note about possible issue with display dummy plugs · 2023-12-21 12:03:25 +01:00
- PolloLoco · 0c1d8e6bea · Update guide with 16.2 (535.129.03) patch · 2023-11-08 13:08:06 +01:00
- PolloLoco · 028d78af09 · Update guide with 16.1 (535.104.06) patch · 2023-09-23 11:28:34 +02:00
- PolloLoco · 9e3df0bdff · Update guide to Proxmox VE 8.0 and vGPU 16.0 drivers · 2023-07-10 20:52:37 +02:00
- PolloLoco · e2955b232a · Remove spoofing instructions. Please don't use spoofing anymore :) · 2023-05-29 18:46:57 +02:00
- PolloLoco · a577cc6625 · Convert CRLF to LF · 2023-03-29 20:13:10 +02:00
- PolloLoco · df93332349 · Update to pve 7.4 · 2023-03-29 20:12:16 +02:00
- PolloLoco · ba197fb9ac · Add buymeacoffee link · 2023-01-24 21:05:02 +01:00
- PolloLoco · 646e599bce · Add support for different driver versions · 2023-01-24 20:40:41 +01:00
- PolloLoco · 0e51ef508e · Simplify instructions, no need to mess around with any uuids anymore · 2023-01-15 13:34:59 +01:00
- PolloLoco · dcf58742b8 · Fix typo · 2023-01-04 22:52:39 +01:00
- PolloLoco · 5ec737e1a3 · Add better examples for VRAM sizes · 2023-01-04 21:57:20 +01:00
- PolloLoco · d1009fd47a · Add small video tutorial on how to obtain the right driver from the nvidia portal · 2022-12-28 12:17:40 +01:00
- PolloLoco · 8cef2c6082 · Fix typo (wrong version number) · 2022-12-28 12:06:13 +01:00
- PolloLoco · ea99035a5b · Fix 'malformed patch' error when applying the patch · 2022-12-04 15:38:44 +01:00
- PolloLoco · ba4b4b4787 · Update guide to 15.0 - 525.60.12 · 2022-12-04 13:09:21 +01:00
- PolloLoco · 22bd687e6d · Update guide to 14.3 - 510.108.03 · 2022-11-24 22:15:30 +01:00
10 changed files with 641 additions and 548 deletions

New binary files (content not shown): 510.108.03.patch, 525.60.12.patch, 525.85.07.patch, 535.104.06.patch, 535.129.03.patch, 535.54.06.patch
README.md (363 changed lines)

@@ -1,20 +1,45 @@
# NVIDIA vGPU on Proxmox

[!["Buy Me A Coffee"](https://www.buymeacoffee.com/assets/img/custom_images/orange_img.png)](https://www.buymeacoffee.com/polloloco)

This document serves as a guide to install NVIDIA vGPU host drivers on the latest Proxmox VE version, at the time of writing PVE 8.1.

You can follow this guide if you have a vGPU supported card from [this list](https://docs.nvidia.com/grid/gpus-supported-by-vgpu.html), or if you are using a consumer GPU from the GeForce series or a non-vGPU-qualified Quadro GPU. There are several sections with a title similar to "Have a vGPU supported GPU? Read here" in this document; make sure to read those very carefully, as this is where the instructions differ between a vGPU qualified card and a consumer card.

## Supported cards

The following consumer/not-vGPU-qualified NVIDIA GPUs can be used with vGPU:
- Most GPUs from the Maxwell 2.0 generation (GTX 9xx, Quadro Mxxxx, Tesla Mxx) **EXCEPT the GTX 970**
- All GPUs from the Pascal generation (GTX 10xx, Quadro Pxxxx, Tesla Pxx)
- All GPUs from the Turing generation (GTX 16xx, RTX 20xx, Txxxx)

If you have a GPU from the Ampere or Ada Lovelace generation, you are out of luck unless you have a vGPU qualified card from [this list](https://docs.nvidia.com/grid/gpus-supported-by-vgpu.html) like the A5000 or RTX 6000 Ada. If you have one of those cards, please consult the [NVIDIA documentation](https://docs.nvidia.com/grid/15.0/grid-vgpu-user-guide/index.html) for help with setting it up.

> **!!! THIS MEANS THAT YOUR RTX 30XX or 40XX WILL NOT WORK !!!**

This guide and all my tests were done on a RTX 2080 Ti, which is based on the Turing architecture.

## Important notes before starting
- This tutorial assumes you are using a clean install of Proxmox VE 8.1.
- If you are using Proxmox VE 8.1, you **MUST** use 16.x drivers. Older versions only work with PVE 7.
- If you tried GPU passthrough before, you absolutely **MUST** revert all of the steps you did to set that up.
- If you only have one GPU in your system and no iGPU, your local monitor will **NOT** give you any output anymore after the system boots up. Use SSH or a serial connection if you want terminal access to your machine.
- Most of the steps can be applied to other Linux distributions, however I'm only covering Proxmox VE here.

> ## Are you upgrading from a previous version of this guide?
>
> If you are upgrading from a previous version of this guide, you should uninstall the old driver by running `nvidia-uninstall` first.
>
> Then you also have to make sure that you are using the latest version of `vgpu_unlock-rs`, otherwise it won't work with the latest driver.
>
> Either delete the folder `/opt/vgpu_unlock-rs`, or enter the folder, run `git pull`, and then recompile the library again using `cargo build --release`.

## Packages

Make sure to add the community PVE repo and get rid of the enterprise repo (you can skip this step if you have a valid enterprise subscription):
```bash
echo "deb http://download.proxmox.com/debian/pve bookworm pve-no-subscription" >> /etc/apt/sources.list
rm /etc/apt/sources.list.d/pve-enterprise.list
```
@@ -44,7 +69,7 @@ git clone https://github.com/mbilker/vgpu_unlock-rs.git

After that, install the Rust compiler:
```bash
curl https://sh.rustup.rs -sSf | sh -s -- -y --profile minimal
```

Now make the rust binaries available in your $PATH (you only have to do this the first time after installing rust):
@@ -110,17 +135,20 @@ Depending on which system you are using to boot, you have to chose from the foll

If you are using an Intel system, append this after `quiet`:
```
intel_iommu=on
```

On AMD systems, you don't have to add anything; `amd_iommu=on` does not exist as a kernel parameter:
https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html?highlight=amd_iommu

For either AMD or Intel, there is an option in case you have heavy performance issues, but it comes at the cost of system security and stability:
```
iommu=pt
```

The result should look like this (for Intel systems):
```
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on"
```
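On AMD systems nothing is appended, so (assuming you also skip the optional `iommu=pt`) the default line simply stays as:
```
GRUB_CMDLINE_LINUX_DEFAULT="quiet"
```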
Now, save and exit from the editor using Ctrl+O and then Ctrl+X, and then apply your changes:
@@ -144,17 +172,20 @@ Depending on which system you are using to boot, you have to chose from the foll

On Intel systems, append this at the end:
```
intel_iommu=on
```

On AMD systems, you don't have to add anything; `amd_iommu=on` does not exist as a kernel parameter:
https://www.kernel.org/doc/html/latest/admin-guide/kernel-parameters.html?highlight=amd_iommu

For either AMD or Intel, there is an option in case you have heavy performance issues, but it comes at the cost of system security and stability:
```
iommu=pt
```

After editing the file, it should look similar to this:
```
root=ZFS=rpool/ROOT/pve-1 boot=zfs intel_iommu=on
```
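On AMD systems the command line again stays unchanged (assuming you also skip the optional `iommu=pt`), for example:
```
root=ZFS=rpool/ROOT/pve-1 boot=zfs
```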
Now, save and exit from the editor using Ctrl+O and then Ctrl+X, and then apply your changes:
@@ -228,7 +259,23 @@ Depending on your mainboard and cpu, the output will be different, in my output 

## NVIDIA Driver

This repo contains patches that allow you to use vGPU on not-vGPU-qualified cards (consumer GPUs). Those patches are binary patches, which means that each patch works **ONLY** for a specific driver version.

I've created patches for the following driver versions:
- 16.2 (535.129.03) - Use this if you are on pve 8.1 (kernel 6.2; 6.5 should work too)
- 16.1 (535.104.06)
- 16.0 (535.54.06)

> ### The following versions are EOL, don't use them unless you have a very specific reason!
> - 15.1 (525.85.07)
> - 15.0 (525.60.12)
> - 14.4 (510.108.03)
> - 14.3 (510.108.03)
> - 14.2 (510.85.03)

You can choose which of those you want to use, but generally it's recommended to use the latest, most up-to-date version (16.2 in this case).

If you have a vGPU qualified GPU, you can use other versions too, because you don't need to patch the driver. However, you still have to make sure they are compatible with your proxmox version and kernel. Also, I would not recommend using any older versions unless you have a very specific requirement.
### Obtaining the driver

@@ -236,27 +283,28 @@ NVIDIA doesn't let you freely download vGPU drivers like they do with GeForce or

NB: When applying for an eval license, do NOT use your personal email or other email at a free email provider like gmail.com. You will probably have to go through manual review if you use such emails. I have very good experience using a custom domain for my email address; that way the automatic verification usually lets me in after about five minutes.

I've created a small video tutorial to find the right driver version on the NVIDIA Enterprise Portal. In the video I'm downloading the 15.0 driver; if you want a different one, just replace 15.0 with the version you want:

![Video Tutorial to find the right driver](downloading_driver.mp4)

After downloading, extract the zip file and then copy the file called `NVIDIA-Linux-x86_64-DRIVERVERSION-vgpu-kvm.run` (where DRIVERVERSION is a string like `535.129.03`) from the `Host_Drivers` folder to your Proxmox host into the `/root/` folder, using tools like FileZilla, WinSCP, scp or rsync.

### ⚠️ From here on, I will be using the 16.2 driver, but the steps are the same for other driver versions

For example, when I run a command like `chmod +x NVIDIA-Linux-x86_64-535.129.03-vgpu-kvm.run`, you should replace `535.129.03` with the driver version you are using (if you are using a different one). You can get the list of version numbers [here](#nvidia-driver).

Every step where you potentially have to replace the version name will have this warning emoji next to it: ⚠️
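Since the version string appears in many commands, one optional convenience (my own suggestion, not part of the guide) is to keep it in a shell variable and build the file name once:

```bash
# Hypothetical helper: store the driver version once, then reuse it
# in every ⚠️ step instead of retyping the full file name each time.
DRIVER_VERSION="535.129.03"   # change this to the version you downloaded
DRIVER_RUN="NVIDIA-Linux-x86_64-${DRIVER_VERSION}-vgpu-kvm.run"
echo "Working with: ${DRIVER_RUN}"
```

You can then write e.g. `chmod +x "$DRIVER_RUN"` in the following steps.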
> ### Have a vgpu supported card? Read here!
>
> If you don't have a card like the Tesla P4, or any other gpu from [this list](https://docs.nvidia.com/grid/gpus-supported-by-vgpu.html), please continue reading at [Patching the driver](#patching-the-driver)
>
> With a supported gpu, patching the driver is not needed, so you should skip the next section. You can simply install the driver package like this:
>
> ⚠️
> ```bash
> chmod +x NVIDIA-Linux-x86_64-535.129.03-vgpu-kvm.run
> ./NVIDIA-Linux-x86_64-535.129.03-vgpu-kvm.run --dkms
> ```
>
> To finish the installation, reboot the system
@@ -269,26 +317,32 @@ scp NVIDIA-Linux-x86_64-510.85.03-vgpu-kvm.run root@pve:/root/

### Patching the driver

Now, on the proxmox host, make the driver executable

⚠️
```bash
chmod +x NVIDIA-Linux-x86_64-535.129.03-vgpu-kvm.run
```

And then patch it

⚠️
```bash
./NVIDIA-Linux-x86_64-535.129.03-vgpu-kvm.run --apply-patch ~/vgpu-proxmox/535.129.03.patch
```

That should output a lot of lines ending with
```
Self-extractible archive "NVIDIA-Linux-x86_64-535.129.03-vgpu-kvm-custom.run" successfully created.
```

You should now have a file called `NVIDIA-Linux-x86_64-535.129.03-vgpu-kvm-custom.run`; that is your patched driver.

### Installing the driver

Now that the required patch is applied, you can install the driver

⚠️
```bash
./NVIDIA-Linux-x86_64-535.129.03-vgpu-kvm-custom.run --dkms
```

The installer will ask you `Would you like to register the kernel module sources with DKMS? This will allow DKMS to automatically build a new module, if you install a different kernel later.`, answer with `Yes`.
@@ -297,7 +351,7 @@ Depending on your hardware, the installation could take a minute or two.

If everything went right, you will be presented with this message.
```
Installation of the NVIDIA Accelerated Graphics Driver for Linux-x86_64 (version: 535.129.03) is now complete.
```

Click `Ok` to exit the installer.
@@ -316,9 +370,9 @@ nvidia-smi

You should get an output similar to this one
```
Tue Jan 24 20:21:28 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.07    Driver Version: 525.85.07    CUDA Version: N/A      |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
```
@@ -366,26 +420,16 @@ The output will be similar to this

If this command doesn't return any output, vGPU unlock isn't working.

Another command you can try to see if your card is recognized as being vgpu enabled is this one:
```bash
nvidia-smi vgpu
```

If everything worked right with the unlock, the output should be similar to this:
```
Tue Jan 24 20:21:43 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.85.07              Driver Version: 525.85.07                 |
|---------------------------------+------------------------------+------------+
| GPU  Name                       | Bus-Id                       | GPU-Util   |
|      vGPU ID     Name           | VM ID     VM Name            | vGPU-Util  |
```

@@ -394,16 +438,12 @@ Sun Aug 7 21:27:04 2022

```
+---------------------------------+------------------------------+------------+
```

However, if you get this output, then something went wrong:
```
No supported devices in vGPU mode
```

If any of those commands give the wrong output, you cannot continue. Please make sure to read everything here very carefully, and when in doubt, create an issue or join the [discord server](#support) and ask for help there.
## vGPU overrides

@@ -411,6 +451,14 @@ Further up we have created the file `/etc/vgpu_unlock/profile_override.toml` and

If we take a look at the output of `mdevctl types`, we see lots of different types that we can choose from. However, if we for example chose `GRID RTX6000-4Q`, which gives us 4GB of vram in a VM, we are locked to that type for all of our VMs. Meaning we can only have 4GB VMs; it's not possible to mix different types to have one 4GB VM and two 2GB VMs.

> ### Important notes
>
> Q profiles *can* give you horrible performance in OpenGL applications/games. To fix that, switch to an equivalent A or B profile (for example `GRID RTX6000-4B`)
>
> C profiles (for example `GRID RTX6000-4C`) only work on Linux; don't try using those on Windows, it will not work - at all.
>
> A profiles (for example `GRID RTX6000-4A`) will NOT work on Linux, they only work on Windows.

All of that changes with the override config file. Technically we are still locked to only using one profile, but now it's possible to change the vram of the profile on a per-VM basis, so even though we have three `GRID RTX6000-4Q` instances, one VM can have 4GB of vram but we can override the vram size for the other two VMs to only 2GB.

Let's take a look at this example config override file (it's in TOML format):
@@ -422,24 +470,21 @@ display_height = 1080 # Maximum display height in the VM
```toml
max_pixels = 2073600 # This is the product of display_width and display_height, so 1920 * 1080 = 2073600
cuda_enabled = 1     # Enables CUDA support. Either 1 or 0 for enabled/disabled
frl_enabled = 1      # This controls the frame rate limiter; if you enable it, your fps in the VM gets locked to 60fps. Either 1 or 0 for enabled/disabled
framebuffer = 0x74000000
framebuffer_reservation = 0xC000000 # In combination with the framebuffer size
                                    # above, these two lines will give you a VM
                                    # with 2GB of VRAM (framebuffer + framebuffer_reservation = VRAM size in bytes).
                                    # See below for some other sizes

[vm.100]
frl_enabled = 0
# You can override all the options from above here too. If you want to add more overrides for a new VM, just copy this block and change the VM ID
```

There are two blocks here, the first being `[profile.nvidia-259]` and the second `[vm.100]`.

The first one applies the overrides to all VM instances of the `nvidia-259` type (that's `GRID RTX6000-4Q`), and the second one applies its overrides only to one specific VM, the one with the proxmox VM ID `100`.

The proxmox VM ID is the same number that you see in the proxmox webinterface, next to the VM name.

You don't have to specify all parameters, only the ones you need/want. There are some more that I didn't mention here; you can find them by going through the source code of the `vgpu_unlock-rs` repo.
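When you override the resolution, `max_pixels` has to be recomputed as width times height. A quick shell sanity check (the 1440p value is my own example, not from the guide):

```bash
# max_pixels must equal display_width * display_height
echo $((1920 * 1080))   # the 1080p value used in the example config
echo $((2560 * 1440))   # a hypothetical 1440p override
```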
@@ -452,65 +497,91 @@ display_height = 1080
```toml
max_pixels = 2073600
```

### Common VRAM sizes

Here are some common framebuffer sizes that you might want to use:

- 512MB:
```toml
framebuffer = 0x1A000000
framebuffer_reservation = 0x6000000
```
- 1GB:
```toml
framebuffer = 0x38000000
framebuffer_reservation = 0x8000000
```
- 2GB:
```toml
framebuffer = 0x74000000
framebuffer_reservation = 0xC000000
```
- 3GB:
```toml
framebuffer = 0xB0000000
framebuffer_reservation = 0x10000000
```
- 4GB:
```toml
framebuffer = 0xEC000000
framebuffer_reservation = 0x14000000
```
- 5GB:
```toml
framebuffer = 0x128000000
framebuffer_reservation = 0x18000000
```
- 6GB:
```toml
framebuffer = 0x164000000
framebuffer_reservation = 0x1C000000
```
- 8GB:
```toml
framebuffer = 0x1DC000000
framebuffer_reservation = 0x24000000
```
- 10GB:
```toml
framebuffer = 0x254000000
framebuffer_reservation = 0x2C000000
```
- 12GB:
```toml
framebuffer = 0x2CC000000
framebuffer_reservation = 0x34000000
```
- 16GB:
```toml
framebuffer = 0x3BC000000
framebuffer_reservation = 0x44000000
```
- 20GB:
```toml
framebuffer = 0x4AC000000
framebuffer_reservation = 0x54000000
```
- 24GB:
```toml
framebuffer = 0x59C000000
framebuffer_reservation = 0x64000000
```
- 32GB:
```toml
framebuffer = 0x77C000000
framebuffer_reservation = 0x84000000
```
- 48GB:
```toml
framebuffer = 0xB2D200000
framebuffer_reservation = 0xD2E00000
```
`framebuffer` and `framebuffer_reservation` will always add up to the VRAM size in bytes.
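As a quick sanity check (my own sketch, not part of the original guide), you can verify any pair from the list above with shell arithmetic, since the two values must sum to the advertised VRAM size:

```bash
# Verify that framebuffer + framebuffer_reservation equals the VRAM size.
# Values taken from the 2GB and 16GB entries above.
echo $(( 0x74000000 + 0xC000000 ))     # sum for the 2GB entry
echo $(( 2 * 1024 * 1024 * 1024 ))     # 2 GiB in bytes
echo $(( 0x3BC000000 + 0x44000000 ))   # sum for the 16GB entry
echo $(( 16 * 1024 * 1024 * 1024 ))    # 16 GiB in bytes
```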
## Adding a vGPU to a Proxmox VM

Go to the proxmox webinterface, go to your VM, then to `Hardware`, then to `Add` and select `PCI Device`.
You should be able to choose from a list of PCI devices. Choose your GPU there; its entry should say `Yes` in the `Mediated Devices` column.

Now you should also be able to select the `MDev Type`. Choose whatever profile you want; if you don't remember which one you want, you can see the list of all available types with `mdevctl types`.
@@ -519,7 +590,41 @@ Finish by clicking `Add`, start the VM and install the required drivers. After i
Enjoy your new vGPU VM :)

## Licensing

Usually a license is required to use vGPU, but luckily the community found several ways around that. Spoofing the vGPU instance to a Quadro GPU used to be very popular, but I don't recommend it anymore, and I've removed the related sections from this guide. If you still want it for whatever reason, you can go back in the commit history to find the instructions on how to use that.

The recommended way to get around the license is to set up your own license server. Follow the instructions [here](https://git.collinwebdesigns.de/oscar.krause/fastapi-dls) (or [here](https://gitea.publichub.eu/oscar.krause/fastapi-dls) if the other link is down).
## Common problems
Most problems can be solved by reading the instructions very carefully. For some very common problems, read here:
- The nvidia driver won't install/load
  - If you were using GPU passthrough before, revert **ALL** of those steps or start with a fresh Proxmox installation. If you run `lspci -knnd 10de:` and see `vfio-pci` under `Kernel driver in use:`, then you have to fix that first
  - Make sure that you are using a supported kernel version (check `uname -a`)
- My OpenGL performance is absolute garbage, what can I do?
  - Read [here](#important-notes)
- `mdevctl types` doesn't output anything, how do I fix it?
  - Make sure that you haven't disabled the unlock if you have a consumer GPU ([more information](#have-a-vgpu-supported-card-read-here))
- vGPU doesn't work on my RTX 3080! What to do?
  - [Learn to read](#your-rtx-30xx-or-40xx-will-not-work-at-this-point-in-time)
  - Make sure that you don't have any dummy plugs connected to the GPU ports; they may cause problems as [reported by a user from the vgpu discord](https://discord.com/channels/829786927829745685/1182258311014400040/1187339682082721822)
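The first few checks above can be bundled into a small diagnostic script — a sketch only, and the function name `vgpu_diag` is made up for this example:

```shell
#!/bin/sh
# Print the basic facts needed to diagnose most vGPU problems.
vgpu_diag() {
    # Kernel version -- must be one that the patched driver supports
    echo "Kernel: $(uname -r)"
    # The GPU must be bound to the nvidia driver, NOT to vfio-pci
    lspci -knnd 10de: 2>/dev/null || echo "lspci not found"
    # An empty type list usually means the unlock is disabled or broken
    if command -v mdevctl >/dev/null 2>&1; then
        mdevctl types
    else
        echo "mdevctl not installed"
    fi
}

vgpu_diag
```

If the output looks wrong (wrong kernel, `vfio-pci`, no mdev types), fix that before asking for help.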
## Support
If something isn't working, please create an issue or join the [Discord server](https://discord.gg/5rQsSV3Byq) and ask for help in the `#proxmox-support` channel so that the community can help you.
> ### DO NOT SEND ME A DM, I'M NOT YOUR PERSONAL SUPPORT
When asking for help, please describe your problem in detail instead of just saying "vgpu doesn't work". Usually a rough overview of your system (GPU, mainboard, Proxmox version, kernel version, ...) and the full output of `dmesg` and/or `journalctl --no-pager -b 0 -u nvidia-vgpu-mgr.service` (run the latter only after starting the VM that causes trouble) is helpful.
Please also provide the output of `uname -a` and `cat /proc/cmdline`.
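To collect all of that in one go, something like the following works (`/tmp/vgpu-debug.txt` is just an example path):

```shell
# Collect system info and vGPU logs into a single file that you can
# attach to your support request (the output path is arbitrary)
{
    uname -a
    cat /proc/cmdline
    echo "--- dmesg ---"
    dmesg
    echo "--- nvidia-vgpu-mgr ---"
    journalctl --no-pager -b 0 -u nvidia-vgpu-mgr.service
} > /tmp/vgpu-debug.txt 2>&1
echo "Debug info written to /tmp/vgpu-debug.txt"
```

Run it as root after starting the problematic VM, so the vgpu-mgr log actually contains the failed start.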
## Feed my coffee addiction ☕
If you found this guide helpful and want to support me, please feel free to [buy me a coffee](https://www.buymeacoffee.com/polloloco). Thank you very much!
## Further reading
Thanks to all these people (in no particular order) for making this project possible
- [DualCoder](https://github.com/DualCoder) for his original [vgpu_unlock](https://github.com/DualCoder/vgpu_unlock) repo with the kernel hooks
- [rupansh](https://github.com/rupansh) for the original [twelve.patch](https://github.com/rupansh/vgpu_unlock_5.12/blob/master/twelve.patch) to patch the driver on kernels >= 5.12
- mbuchel#1878 on the [GPU Unlocking discord](https://discord.gg/5rQsSV3Byq) for [fourteen.patch](https://gist.github.com/erin-allison/5f8acc33fa1ac2e4c0f77fdc5d0a3ed1) to patch the driver on kernels >= 5.14
- [erin-allison](https://github.com/erin-allison) for the [nvidia-smi wrapper script](https://github.com/erin-allison/nvidia-merged-arch/blob/d2ce752cd38461b53b7e017612410a3348aa86e5/nvidia-smi)
- LIL'pingu#9069 on the [GPU Unlocking discord](https://discord.gg/5rQsSV3Byq) for his patch to nop out code that NVIDIA added to prevent usage of drivers with a version between 460 and 470 with consumer cards
If I forgot to mention someone, please create an issue or let me know otherwise.