Install NVIDIA Drivers and Vulkan on Rocky Linux and RHEL

Install Prerequisites

Install the Extra Packages for Enterprise Linux (EPEL) repository and other packages before installing NVIDIA drivers.

sudo dnf -y install https://dl.fedoraproject.org/pub/epel/epel-release-latest-8.noarch.rpm

RHEL-based distributions require Dynamic Kernel Module Support (DKMS) to build the GPU driver kernel modules, and the dkms package is provided by the EPEL repository. For more information about EPEL, see https://fedoraproject.org/wiki/EPEL. After enabling the repository, upgrade the kernel and restart the machine.

sudo dnf -y upgrade kernel
sudo reboot now

Install Kernel Headers

Install kernel headers and development packages:

sudo dnf -y install kernel-devel-$(uname -r) kernel-headers-$(uname -r)

If installing kernel headers does not work correctly, follow these steps instead:

Identify the Linux kernel you are using by issuing the uname -r command.
Use the name of the kernel (4.18.0-553.el8_10.x86_64 in the following code example) to install kernel headers and development packages:

sudo dnf -y install \
kernel-devel-4.18.0-553.el8_10.x86_64 \
kernel-headers-4.18.0-553.el8_10.x86_64

Install the dependencies and extra packages:

sudo dnf install -y kernel-devel kernel-headers pciutils dkms
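
DKMS compiles the driver against the build tree of the running kernel, so it can help to confirm that the tree exists before installing the driver. The following is a minimal sketch, assuming the build tree is exposed at /lib/modules/<release>/build, as is usual on RHEL-family systems once the matching kernel-devel package is installed. The function name and its optional second parameter are hypothetical and exist only for illustration.

```shell
# Return success if the kernel build tree for the given release exists.
# $1: kernel release (for example, the output of uname -r)
# $2: modules root; defaults to /lib/modules (parameter added for illustration)
kernel_build_tree_present() {
    [ -d "${2:-/lib/modules}/$1/build" ]
}

if kernel_build_tree_present "$(uname -r)"; then
    echo "Build tree found for kernel $(uname -r)"
else
    echo "No build tree for kernel $(uname -r); install the matching kernel-devel package"
fi
```

If the check fails right after a kernel upgrade, the machine may still be running the old kernel; rebooting and repeating the check is a reasonable first step.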

Install NVIDIA Drivers and Vulkan

CUDA is a parallel computing platform and application programming interface (API) model. It uses a CUDA-enabled graphics processing unit (GPU) for general-purpose processing. The CUDA platform provides direct access to the GPU virtual instruction set and parallel computation elements. For more information on CUDA unrelated to installing HEAVY.AI, see https://developer.nvidia.com/cuda-zone. You can install drivers in multiple ways. This section provides installation information using the NVIDIA website or using dnf.

Although using the NVIDIA website is more time-consuming and less automated, you are assured that the driver is certified for your GPU. Use this method if you are not sure which driver to install. If you prefer a more automated method and are confident that the driver is certified, you can use the DNF package manager method.

Install NVIDIA Drivers Using the NVIDIA Website

Install the CUDA package for your platform and operating system according to the instructions on the NVIDIA website (https://developer.nvidia.com/cuda-downloads).

If you do not know the GPU model installed on your system, run this command:

lspci -v | grep -E "3D|VGA.*NVIDIA" | awk -F '\[|\]' '{ print $2 }'

The output shows the product type, series, and model. In this example, the product type is Tesla, the series is T (as Turing), and the model is T4.

Tesla T4
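
The same extraction can be wrapped in a small function so that it is easy to try against a known line before running it on a server. This is only a sketch: the function name is hypothetical, and the sample line is an assumed example of how lspci typically describes a Tesla T4.

```shell
# Print the bracketed product name from lspci lines that describe
# an NVIDIA 3D or VGA controller. Reads lspci output on stdin.
nvidia_gpu_models() {
    grep -E '(3D|VGA).*NVIDIA' | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Illustrative input line (assumed format of a typical lspci entry).
sample='00:1e.0 3D controller: NVIDIA Corporation TU104GL [Tesla T4] (rev a1)'
printf '%s\n' "$sample" | nvidia_gpu_models

# On a real system: lspci | nvidia_gpu_models
```

With the sample line, the function prints Tesla T4. Lines for non-NVIDIA controllers produce no output.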

Select the product type shown after running the command above.
Select the correct product series and model for your installation.
In the Operating System dropdown list, select Linux 64-bit.
In the CUDA Toolkit dropdown list, click a supported version (11.4 or higher).
Click Search.
On the resulting page, verify the download information and click Download.

Verify that the driver version you download meets the HEAVY.AI minimum requirements.

Move the downloaded file to the server, change the permissions, and run the installation.

chmod +x NVIDIA-Linux-x86_64-*.run
sudo ./NVIDIA-Linux-x86_64-*.run

You might receive the following error during installation:

ERROR: The Nouveau kernel driver is currently in use by your system. This driver is incompatible with the NVIDIA driver, and must be disabled before proceeding. Please consult the NVIDIA driver README and your Linux distribution's documentation for details on how to correctly disable the Nouveau kernel driver.

If you receive this error, blacklist the Nouveau driver by adding the following lines to the end of the /etc/modprobe.d/blacklist-nouveau.conf file:

blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
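
A blacklist entry only takes effect when the module is next prevented from loading, so it can be useful to check whether Nouveau is still active. The sketch below reads the kernel module list. The function name and its optional file parameter are hypothetical. The dracut and reboot commands in the comments are the usual follow-up on RHEL-family systems but are an assumption here, not part of the original instructions.

```shell
# Return success if the nouveau module appears in a module list.
# $1: module list file; defaults to /proc/modules (parameter added for illustration)
nouveau_loaded() {
    grep -q '^nouveau ' "${1:-/proc/modules}" 2>/dev/null
}

if nouveau_loaded; then
    echo "nouveau is still loaded"
    # Commonly needed after editing the blacklist (run manually):
    #   sudo dracut --force
    #   sudo reboot
else
    echo "nouveau is not loaded"
fi
```

Once the check reports that Nouveau is not loaded, rerun the NVIDIA installer.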

Install NVIDIA Drivers Using DNF

Install a specific version of the driver for your GPU by installing the NVIDIA repository and using the DNF package manager.

When installing the driver, ensure your GPU model is supported and meets the HEAVY.AI minimum requirements.

Add the NVIDIA network repository to your system.

sudo dnf config-manager --add-repo \
http://developer.download.nvidia.com/compute/cuda/repos/rhel8/$(uname -i)/cuda-rhel8.repo

Install the required driver version with dnf. For HEAVY.AI 8.0, the minimum driver version is 535.

sudo dnf -y module install nvidia-driver:535-dkms

To load the installed driver, run sudo modprobe nvidia or nvidia-smi. If you upgraded an existing driver, reboot the system with sudo reboot to ensure that the new driver version is loaded.

Check NVIDIA Driver Installation

Run the nvidia-smi command to verify that your drivers are installed correctly and recognize the GPUs in your environment. The output should list your NVIDIA GPUs and the installed driver version.

If you encounter an error similar to the following, the NVIDIA drivers are likely installed incorrectly: NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Please ensure that the latest NVIDIA driver is installed and running.

Please review the Install NVIDIA Drivers section and correct any errors.
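
The version check can also be scripted. The following sketch compares the major version reported by nvidia-smi with the minimum of 535 mentioned earlier. The function name is hypothetical, and the comparison looks only at the major version number, which is an assumption about how the minimum requirement is meant to be read.

```shell
# Return success if a driver version string meets a minimum major version.
# $1: driver version such as 535.104.05
# $2: minimum major version such as 535
driver_meets_minimum() {
    major=${1%%.*}
    [ "$major" -ge "$2" ] 2>/dev/null
}

if command -v nvidia-smi >/dev/null 2>&1; then
    # Query only the driver version, one line per GPU; keep the first line.
    version=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    if driver_meets_minimum "$version" 535; then
        echo "Driver $version meets the minimum"
    else
        echo "Driver $version is below the minimum (or could not be read)"
    fi
else
    echo "nvidia-smi not found; the driver is probably not installed"
fi
```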

Install Vulkan

The back-end renderer requires a Vulkan-enabled driver and the Vulkan library to work correctly. Without these components, the database cannot start unless the back-end renderer is disabled.

To ensure the Vulkan library and its dependencies are installed, use DNF:

sudo dnf -y install vulkan
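
To confirm that the Vulkan loader is visible to the dynamic linker after installation, a check along these lines may help. It is a sketch that assumes the loader is registered under its usual soname, libvulkan.so.1. The function name is hypothetical.

```shell
# Return success if the Vulkan loader appears in a linker cache listing.
# Reads the output of `ldconfig -p` on stdin.
vulkan_loader_listed() {
    grep -q 'libvulkan\.so\.1'
}

if command -v ldconfig >/dev/null 2>&1 && ldconfig -p | vulkan_loader_listed; then
    echo "Vulkan loader found"
else
    echo "Vulkan loader not found in the linker cache"
fi
```

This confirms only that the library is installed. It does not confirm that the NVIDIA driver exposes a working Vulkan device.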

For more information about troubleshooting Vulkan, see the Vulkan Renderer section.

Install CUDA Toolkit (Optional)

You must install the CUDA Toolkit if you use advanced features like C++ User-Defined Functions or User-Defined Table Functions to extend the database capabilities.

1. Add the NVIDIA network repository to your system:

sudo dnf config-manager --add-repo \
http://developer.download.nvidia.com/compute/cuda/repos/rhel8/$(uname -i)/cuda-rhel8.repo

2. List the available CUDA Toolkit versions using the dnf list command:

dnf list 'cuda-toolkit-*' | grep -v config

Available Packages
cuda-toolkit-10-1.x86_64                     10.1.243-1        cuda-rhel8-x86_64
cuda-toolkit-10-2.x86_64                     10.2.89-1         cuda-rhel8-x86_64
cuda-toolkit-11-0.x86_64                     11.0.3-1          cuda-rhel8-x86_64
cuda-toolkit-11-1.x86_64                     11.1.1-1          cuda-rhel8-x86_64
cuda-toolkit-11-2.x86_64                     11.2.2-1          cuda-rhel8-x86_64
cuda-toolkit-11-3.x86_64                     11.3.1-1          cuda-rhel8-x86_64
cuda-toolkit-11-4.x86_64                     11.4.4-1          cuda-rhel8-x86_64
cuda-toolkit-11-5.x86_64                     11.5.2-1          cuda-rhel8-x86_64
cuda-toolkit-11-6.x86_64                     11.6.2-1          cuda-rhel8-x86_64
cuda-toolkit-11-7.x86_64                     11.7.1-1          cuda-rhel8-x86_64
cuda-toolkit-11-8.x86_64                     11.8.0-1          cuda-rhel8-x86_64
cuda-toolkit-12.x86_64                       12.5.0-1          cuda-rhel8-x86_64
cuda-toolkit-12-0.x86_64                     12.0.1-1          cuda-rhel8-x86_64
cuda-toolkit-12-1.x86_64                     12.1.1-1          cuda-rhel8-x86_64
cuda-toolkit-12-2.x86_64                     12.2.2-1          cuda-rhel8-x86_64
cuda-toolkit-12-3.x86_64                     12.3.2-1          cuda-rhel8-x86_64
cuda-toolkit-12-4.x86_64                     12.4.1-1          cuda-rhel8-x86_64
cuda-toolkit-12-5.x86_64                     12.5.0-1          cuda-rhel8-x86_64

3. Install the CUDA Toolkit version using DNF.

sudo dnf -y install cuda-toolkit-<version>.x86_64
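
In the package names listed above, the version uses hyphens rather than dots (for example, 12-5 for release 12.5). The helper below is a hypothetical convenience that builds the package name from a dotted release number. It assumes the x86_64 naming shown in the listing.

```shell
# Build the DNF package name for a CUDA Toolkit release.
# $1: release in dotted form, such as 12.5
cuda_toolkit_package() {
    printf 'cuda-toolkit-%s.x86_64\n' "$(printf '%s' "$1" | tr '.' '-')"
}

cuda_toolkit_package 12.5   # prints cuda-toolkit-12-5.x86_64
```

The result can be passed directly to the install command, for example sudo dnf -y install "$(cuda_toolkit_package 12.5)".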

4. Check that everything is working correctly:

nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2020 NVIDIA Corporation
Built on Mon_Nov_30_19:08:53_PST_2020
Cuda compilation tools, release 11.2, V11.2.67
Build cuda_11.2.r11.2/compiler.29373293_0
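
To pick the release number out of this output in a script, a small filter such as the following can be used. It is a sketch with a hypothetical function name. The hint about /usr/local/cuda/bin reflects the usual install location for the NVIDIA packages and is an assumption about your system.

```shell
# Print the release number from `nvcc --version` output read on stdin.
nvcc_release() {
    sed -n 's/.*release \([0-9][0-9.]*\),.*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
    echo "CUDA Toolkit release: $(nvcc --version | nvcc_release)"
else
    echo "nvcc not found; /usr/local/cuda/bin may be missing from PATH"
fi
```

For the sample output shown above, the filter prints 11.2.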

