JupyterLab Installation and Configuration

You can use JupyterLab to access HeavyDB.

Installing Jupyter with HeavyDB

When HEAVY.AI is running in Docker, the paths used must be container paths accessible to Docker, instead of OS-level paths. Keep the following in mind when performing Docker-based installations:

  • You must locate your docker-compose.yml file in a location that is reachable from Docker.

  • The ingest and export paths from HEAVY.AI are likely different from the actual location of the file because HEAVY.AI uses container paths instead of operating system paths.

  1. Install the NVIDIA drivers and nvidia-container-runtime for your operating system, using the instructions at https://github.com/NVIDIA/nvidia-container-runtime.

  2. For Apt-based installations, such as Ubuntu, use the Docker preparation instructions.

  3. Change the default Docker runtime to nvidia and restart Docker.

    a. Edit /etc/docker/daemon.json and add "default-runtime": "nvidia". The resulting file should look similar to the following:

    {
        "default-runtime": "nvidia",
        "runtimes": {
          "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
          }
        }
    }

    b. Restart the Docker service:

    sudo systemctl restart docker

    c. Validate NVIDIA Docker:

    docker run --rm nvidia/cudagl:11.0-runtime-ubuntu18.04 nvidia-smi
  4. Create a HEAVY.AI storage directory with a name of your choosing and change directories to it.

    sudo mkdir /var/lib/heavyai
    cd /var/lib/heavyai
  5. Create the directory /var/lib/heavyai/jupyter/.

    sudo mkdir /var/lib/heavyai/jupyter
  6. Create the file /var/lib/heavyai/heavyai.conf.

  7. In heavyai.conf, configure the jupyter-url setting under the [web] section to point to the Jupyter service:

    [web]
    jupyter-url = "http://jupyterhub:8000"
    servers-json = "/heavyai-storage/servers.json"
  8. Create the file /var/lib/heavyai/data/heavyai.license. Copy your license key from the registration email message. If you have not received your license key, contact your Sales Representative or register for a 30-day trial.

  9. Create the following /var/lib/heavyai/servers.json entry to enable Jupyter features in Immerse.

    [
      {
        "enableJupyter": true
      }
    ]
  10. Create /var/lib/heavyai/docker-compose.yml.
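
    The contents of docker-compose.yml vary by release. The sketch below is illustrative only; the service names, image names, and tags are assumptions rather than values from this page (only the /heavyai-storage container path and the http://jupyterhub:8000 URL appear in the configuration above):

    ```yaml
    # Illustrative sketch only -- image names, tags, and volume targets are
    # assumptions; use the values for your HEAVY.AI release.
    version: "3.7"

    services:
      heavyaiserver:
        image: heavyai/heavyai-ee-cuda:latest   # placeholder image/tag
        restart: always
        volumes:
          # /heavyai-storage is the container path referenced by heavyai.conf.
          - /var/lib/heavyai:/heavyai-storage

      # Naming the service "jupyterhub" makes http://jupyterhub:8000 (the
      # jupyter-url configured above) resolvable on the Compose network.
      jupyterhub:
        image: jupyterhub/jupyterhub:latest     # placeholder image/tag
        restart: always
        ports:
          - "8000:8000"
        volumes:
          - /var/lib/heavyai/jupyter:/srv/jupyterhub
    ```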

  11. Run docker-compose pull to download the images from Docker Hub. This step is optional when you specify a version in docker-compose.yml.

  12. Make sure you are in the storage directory that you created in step 4, and run Compose in detached mode:
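
    A sketch of the standard Compose invocation for starting the services in the background:

    ```shell
    # Start all services defined in docker-compose.yml in detached mode.
    docker-compose up -d
    ```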

The Compose services restart by default whenever they are stopped.

  13. Log in as the super user (admin/HyperInteractive).

  14. Create required users in HeavyDB.

  15. Create the heavyai_jupyter role in HeavyDB:

  16. Grant the heavyai_jupyter role to users who require Jupyter access:
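
    Steps 14 through 16 can be sketched as a single session. The heavysql client invocation and the example user name jupyter_user are assumptions, not values from this page:

    ```shell
    # Connect as the super user with heavysql, HEAVY.AI's command-line SQL
    # client. "jupyter_user" and its password are examples only.
    heavysql -u admin -p HyperInteractive <<'SQL'
    CREATE USER jupyter_user (password = 'ChangeMe123!');
    CREATE ROLE heavyai_jupyter;
    GRANT heavyai_jupyter TO jupyter_user;
    SQL
    ```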

You might need to stop running lab containers so that they are restarted with the new image.
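
One way to do this, assuming the single-user lab containers follow JupyterHub DockerSpawner's default jupyter-<username> naming (an assumption, not stated on this page):

```shell
# List the per-user lab containers:
docker ps --filter "name=jupyter-" --format "{{.Names}}"
# Stop them so they are recreated from the new image on the next launch:
docker ps -q --filter "name=jupyter-" | xargs -r docker stop
```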

You should now see Jupyter icons in the upper right of Immerse and when running queries in SQL Editor.

Adding Jupyter to an Existing HeavyDB Instance

When HEAVY.AI is running in Docker, the paths used must be container paths accessible to Docker, instead of OS-level paths. Keep the following in mind when performing Docker-based installations:

  • You must locate your docker-compose.yml file in a location that is reachable from Docker.

  • The ingest and export paths from HEAVY.AI are likely different from the actual location of the file because HEAVY.AI uses container paths instead of operating system paths.

To use Jupyter with an existing, non-Docker install of HEAVY.AI, change HEAVY.AI to run on Docker instead of the host. The steps are the same as the install instructions, with the following exceptions:

  1. Change the volume mappings in docker-compose.yml to point to your existing installation path.

  2. Enable the required environment variables and change the relevant paths to match your existing installation.

  3. If you have an existing heavyai.conf file:

    • Add the required sections instead of creating a new file.

    • Remove the data, port, http-port, and frontend properties, which should not be changed with a Docker installation.

    • Ensure that all paths, such as cert and key, are accessible by Docker.

  4. If you have an existing servers.json file, move it to your HEAVY.AI home directory (/var/lib/heavyai by default) and add the following key/value pair:
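
    The pair is presumably the same enableJupyter flag shown in the installation steps above, merged into your existing server entry:

    ```json
    [
      {
        "enableJupyter": true
      }
    ]
    ```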

Before running docker-compose up -d, ensure that any existing installations are stopped and disabled. For example:
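
A sketch of stopping and disabling a host-based installation; the systemd service names heavydb and heavy_web_server are assumptions and may differ by release:

```shell
# Check actual unit names first with: systemctl list-units | grep -i heavy
sudo systemctl stop heavydb heavy_web_server
sudo systemctl disable heavydb heavy_web_server
```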

As with any software upgrade, back up your data before you upgrade HEAVY.AI.

Creating the jhub_heavyai_dropbox Directory

Create the jhub_heavyai_dropbox directory and make it writable by your users. Change the volume mappings in docker-compose.yml to point to your existing installation path.
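
A minimal sketch, assuming the default /var/lib/heavyai storage layout used earlier on this page:

```shell
# Create the dropbox directory inside the Jupyter storage directory.
sudo mkdir -p /var/lib/heavyai/jupyter/jhub_heavyai_dropbox
# World-writable for simplicity; tighten ownership/permissions as needed.
sudo chmod 777 /var/lib/heavyai/jupyter/jhub_heavyai_dropbox
```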

This allows Jupyter users to write files into /home/jovyan/jhub_heavyai_dropbox/. You can also use that directory path in SQL COPY FROM commands.

Upgrading docker-compose Services

To upgrade Jupyter images using the docker-compose.yml file, edit docker-compose.yml as follows:

Then, use the following commands to download the images and restart the services with the new versions:
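
With Compose, the usual sequence (assuming you are in the directory containing docker-compose.yml) is:

```shell
# Download the updated images, then recreate the services from them.
docker-compose pull
docker-compose up -d
```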

You might also need to stop lab containers so that they are restarted with the new image.

For each user, run the following command:
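
A sketch, assuming JupyterHub DockerSpawner's default jupyter-<username> container naming (an assumption, not stated on this page):

```shell
# Replace <username> with the user's name.
docker stop jupyter-<username>
```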

Using Jupyter

Open JupyterLab by clicking the Jupyter icon in the upper right corner of Immerse.

JupyterLab opens in a new tab. You are signed in automatically using HEAVY.AI authentication, with a notebook ready to start a HEAVY.AI connection. The notebook is saved for you at the root of your Jupyter file system.

You can verify the location of your file by clicking the folder icon in the top left.

The contents of the cell are prefilled with explanatory comments and the heavyai_connect() method, ready to set up your HEAVY.AI connection in Jupyter. Click the Play button to run the connection statement and list the tables in your HeavyDB instance.

You can continue to use con to run more Ibis expressions in other cells.

The connection reuses the session already in use by Heavy Immerse by passing Jupyter the raw session ID. If you connect this way without credentials, the connection has a time-to-live (TTL). After the session timeout period passes with no activity (60 minutes by default), the session is invalidated. You have to reenter Jupyter from Immerse in the same way to reestablish the connection, or use the heavyai connect method to enter your credentials manually.

You can also launch Jupyter from the Heavy Immerse SQL Editor. After you run a query in the SQL Editor, you see a button that allows you to send your query to Jupyter.

The query appears in a separate notebook, ready to run. You must run the cell yourself to execute the query and see the results.
