Install the Extra Packages for Enterprise Linux (EPEL) repository and other packages before installing NVIDIA drivers.
For CentOS, use yum to install the epel-release
package.
Use the following install command for RHEL.
RHEL-based distributions require Dynamic Kernel Module Support (DKMS) to build the GPU driver kernel modules. For more information, see https://fedoraproject.org/wiki/EPEL. Upgrade the kernel and restart the machine.
Install kernel headers and development packages:
If installing kernel headers does not work correctly, follow these steps instead:
Identify the Linux kernel you are using by issuing the uname -r
command.
Use the name of the kernel (3.10.0-862.11.6.el7.x86_64
in the following code example) to install kernel headers and development packages:
Install the dependencies and extra packages:
CUDA is a parallel computing platform and application programming interface (API) model. It uses a CUDA-enabled graphics processing unit (GPU) for general-purpose processing. The CUDA platform provides direct access to the GPU virtual instruction set and parallel computation elements. For more information on CUDA unrelated to installing HEAVY.AI, see https://developer.nvidia.com/cuda-zone. You can install drivers in multiple ways. This section provides installation information using the NVIDIA website or using yum.
Although using the NVIDIA website is more time consuming and less automated, you are assured that the driver is certified for your GPU. Use this method if you are not sure which driver to install. If you prefer a more automated method and are confident that the driver is certified, you can use the package-manager method.
Install the CUDA package for your platform and operating system according to the instructions on the NVIDIA website (https://developer.nvidia.com/cuda-downloads).
If you do not know the GPU model installed on your system, run this command:
The output shows the product type, series, and model. In this example, the product type is Tesla, the series is T (as Turing), and the model is T4.
Select the product type shown after running the command above.
Select the correct product series and model for your installation.
In the Operating System dropdown list, select Linux 64-bit.
In the CUDA Toolkit dropdown list, click a supported version (11.4 or higher).
Click Search.
On the resulting page, verify the download information and click Download.
Please check that the driver's version you are downloading meets the HEAVI.AI minimum requirements.
Move the downloaded file to the server, change the permissions, and run the installation.
You might receive the following error during installation:
ERROR: The Nouveau kernel driver is currently in use by your system. This driver is incompatible with the NVIDIA driver, and must be disabled before proceeding. Please consult the NVIDIA driver README and your Linux distribution's documentation for details on how to correctly disable the Nouveau kernel driver.
If you receive this error, blacklist the Nouveau driver by editing the /etc/modprobe.d/blacklist-nouveau.conf
file, adding the following lines at the end:
blacklist nouveau
blacklist lbm-nouveau
options nouveau modeset=0
alias nouveau off
alias lbm-nouveau off
Install a specific version of the driver for your GPU by installing the NVIDIA repository and using the yum
package manager.
When installing the driver, ensure that your GPU model is supported and meets the HEAVI.AI minimum requirements.
Add the NVIDIA network repository to your system.
Run the available drivers for the download.
Install the driver version needed with yum
.
Reboot your system to ensure that the new version of the driver is loaded.
Run nvidia-smi
to verify that your drivers are installed correctly and recognize the GPUs in your environment. Depending on your environment, you should see something like this to confirm that your NVIDIA GPUs and drivers are present:
If you see an error like the following, the NVIDIA drivers are probably installed incorrectly:
Review the Install NVIDIA Drivers section and correct any errors.
To work correctly, the back-end renderer requires a Vulkan-enabled driver and the Vulkan library. Without these components, the database cannot start without disabling the back-end renderer.
Install the Vulkan library and its dependencies using yum
both CentOS and RHEL.
If installing on RHEL, you must obtain and manually install the vulkan-filesystem package manually. Perform these additional steps:
Download the rpm file
Install the rpm file
You might see a warning similar to the following:
Ignore it now; you can verify NVIDIA driver installation here.
For more information about troubleshooting Vulkan, see the Vulkan Renderer section.
You must install the CUDA Toolkit if you use advanced features like C++ User-Defined Functions or User-Defined Table Functions to extend the database capabilities.
Add the NVIDIA network repository to your system:
2. List the available CUDA Toolkit versions:
3. Install the CUDA Toolkit using yum
:
4. Check that everything is working correctly: