Using GPUs with Virtual Machines on vSphere – Part 3: Installing the NVIDIA Virtual GPU Technology

This is part 3 of a series of blog articles on the subject of using GPUs with VMware vSphere.

Part 1 of this series presents an overview of the various options for using GPUs on vSphere

Part 2 describes the DirectPath I/O (Passthrough) mechanism for GPUs

Part 3 gives details on setting up the NVIDIA Virtual GPU (vGPU) technology for GPUs on vSphere

Part 4 explores the setup for the Bitfusion Flexdirect method of using GPUs

In this article, we describe the NVIDIA vGPU (formerly “Grid”) method for using GPU devices on vSphere. The focus in this blog is on the use of GPUs for compute workloads (such as for machine learning, deep learning and high performance computing applications) and we are not looking at GPU usage for virtual desktop infrastructure (VDI) here.

The method of GPU usage on vSphere described here makes use of the products within the NVIDIA vGPU family. This vGPU family includes the “NVIDIA Virtual ComputeServer”, (VCS) and the “NVIDIA Quadro Virtual Datacenter Workstation” (vDWS) products for GPU access and management on vSphere, as well as other products. Here, we use the term “NVIDIA vGPU” as a synonym for the software product you choose from the vGPU family of products. NVIDIA recommends the vCS software product for machine learning and AI workloads, whereas vDWS was used for that purpose before vCS appeared on the market. These are licensed software products from NVIDIA.

Figure 1: Parts of the NVIDIA vGPU product shown in the ESXi Hypervisor and in virtual machines

Figure 1 shows the relationship of the parts of the NVIDIA vGPU product to each other in the overall vSphere and virtual machine architecture.

The NVIDIA vGPU software includes two separate components:

The NVIDIA Virtual GPU Manager, that is loaded as a VMware Installation Bundle (VIB) into the vSphere ESXi hypervisor itself and
A separate guest OS NVIDIA vGPU driver that is installed within the guest operating system of your virtual machine (the “guest VM driver”).

Using the NVIDIA vGPU technology with vSphere allows you to choose between dedicating a full GPU device to one virtual machine or to allow partial sharing of a GPU device by more than one virtual machine.

The reasons for choosing this NVIDIAvGPU option are

we know that the applications in your VMs do not need the power of full GPU;
there is a limited number of GPU devices and we want them to be available to more than one team of users simultaneously;
we sometimes want to dedicate a full GPU device to one VM, but at other times allow partial use of a GPU to a VM.

The released versions of the NVIDIA vGPU Manager and guest VM drivers that you install must be compatible. For all the versions of the software, versions of vSphere and the hardware versions, consult the current NVIDIA Release Notes. At the time of this writing, the NVIDIA vGPU release notes are located here.

1.NVIDIA vGPU Setup on the vSphere Host Server

The vSphere ESXi host server-specific part of the setup process is described first here. In order to set up the NVIDIA vGPU environment you will need:

The licensed NVIDIA vGPU product (including the VIB for vSphere and the guest OS driver)
Administrator login access to the console of your vSphere/ESXi machine

VMware recommends that you choose vSphere version 6.7 for this work. Choosing vSphere 6.7 update 1 will allow you to use the vMotion feature along with your GPU-enabled VM’s. If you choose to use vSphere 6.5 then ensure you are on update 1 before proceeding.

Carefully review the pre-requisites and other details in the NVIDIA vGPU Software User Guide document

The host server part of the NVIDIA vGPU installation process makes use of a “vib install” technique that is used in vSphere for installing drivers into the ESXi hypervisor itself. For more information on using vSphere VIBs, you should check this material

The NVIDIA vGPU Manager is contained in the VIB package that is downloaded from NVIDIA’s website. The package can be found by searching for the NVIDIA Quadro Virtual DataCenter Workstation (or vDWS) products on the NVIDIA site.

To install the NVIDIA vGPU Manager software into the vSphere ESXi hypervisor follow the procedure below.

1.1 Set the GPU Device to vGPU Mode Using the vSphere Host Graphics Setting

A GPU card can be configured in one of two modes: vSGA (shared virtual graphics) and vGPU. The NVIDIA card should be configured with vGPU mode. This is specifically for use of the GPU in compute workloads, such as in machine learning or high performance computing applications.

Access the ESXi host server either using the ESXi shell or through SSH. You will need to enable SSH access using the ESXi management console as SSH is disabled by default.To enable vGPU mode on the ESXi host, use the command line to execute this command:

# esxcli graphics host set –-default-type SharedPassthru

You may also get to this setting through the vSphere Client by choosing your host server and using the navigation

“Configure -> Hardware -> Graphics -> Host Graphics tab -> Edit”

A server reboot is required once the setting has been changed. The settings should appear as shown in Figure 2 below.

Figure 2: Edit Host Graphics screen for a Host Server in the vSphere Client

1.2 Check the Host Graphics Settings

To check that the settings have taken using the command line, type

# esxcli graphics host get

This command should produce output as follows: