Configuration Parameters
HEAVY.AI has minimal configuration requirements with a number of additional configuration options. This topic describes the required and optional configuration changes you can use in your HEAVY.AI instance.
In release 4.5.0 and higher, HEAVY.AI requires that all configuration flags used at startup match a flag on the HEAVY.AI server. If any flag is misspelled or invalid, the server does not start. This helps ensure that all settings are intentional and will not have unexpected impact on performance or data integrity.
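Because a single misspelled flag stops the server from starting, it can help to check a configuration file before startup. The following is a minimal sketch, not part of HEAVY.AI: `check_flags` is a hypothetical helper, and the list of known flags passed to it is an assumption that you would assemble yourself from the parameter tables in this topic. It does not distinguish between sections of the file and only reports names that are not in the list you supply.

```shell
#!/bin/sh
# check_flags FILE KNOWN_FLAG...
# Hypothetical pre-flight helper: prints every flag name in FILE that is not
# in the KNOWN_FLAG list and returns non-zero if any were found.
check_flags() {
    file=$1
    shift
    known=" $* "          # space-delimited list for simple substring matching
    status=0
    while IFS= read -r line; do
        case "$line" in
            ''|'#'*|'['*) continue ;;   # skip blank lines, comments, section headers
        esac
        key=${line%%=*}                              # text before the first '='
        key=$(printf '%s' "$key" | tr -d '[:space:]')
        case "$known" in
            *" $key "*) ;;                           # flag is in the known list
            *) echo "unknown flag: $key"; status=1 ;;
        esac
    done < "$file"
    return $status
}

# Demonstration on a throwaway file containing one misspelled flag.
demo=$(mktemp)
printf '%s\n' 'port = 6274' 'alow-loop-joins = true' > "$demo"
check_flags "$demo" port allow-loop-joins || echo "fix the flags above before starting the server"
rm -f "$demo"
```

The server performs its own validation at startup; a check like this only surfaces typos earlier.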

Data Directory

Before starting the HEAVY.AI server, you must initialize the persistent data directory. To do so, create an empty directory at the desired path, such as /var/lib/heavyai. Create the environment variable $HEAVYAI_STORAGE.
export HEAVYAI_STORAGE=/var/lib/heavyai
Change the owner of the directory to the user that the server will run as ($HEAVYAI_USER):
sudo mkdir -p $HEAVYAI_STORAGE
sudo chown -R $HEAVYAI_USER $HEAVYAI_STORAGE
Where $HEAVYAI_USER is the system user account that the server runs as, such as heavyai, and $HEAVYAI_STORAGE is the path to the parent of the HEAVY.AI server data directory.
Finally, run $HEAVYAI_PATH/bin/initheavy with the data directory path as the argument:
$HEAVYAI_PATH/bin/initheavy $HEAVYAI_STORAGE

Configuring a Custom Heavy Immerse Subdirectory

Immerse serves the application from the root path (/) by default. To serve the application from a sub-path, you must modify the $HEAVYAI_PATH/frontend/app-config.js file to change the IMMERSE_PATH_PREFIX value. The Heavy Immerse path must start with a forward slash (/).

Configuration File

The configuration file stores runtime options for your HEAVY.AI servers. You can use the file to change default behavior.
The heavy.conf file is stored in the $HEAVYAI_STORAGE directory. The configuration settings are picked up automatically by the sudo systemctl start heavydb and sudo systemctl start heavy_web_server commands.
Set the flags in the configuration file using the format <flag> = <value>. Strings must be enclosed in quotes.
The following is a sample configuration file. The entry for data path is a string and must be in quotes. The last entry in the first section, for null-div-by-zero, is the Boolean value true and does not require quotes.
port = 6274
http-port = 6278
data = "/var/lib/heavyai/data"
null-div-by-zero = true

[web]
port = 6273
frontend = "/opt/heavyai/frontend"
servers-json = "/var/lib/heavyai/servers.json"
enable-https = true
To comment out a line in heavy.conf, prepend the line with the pound sign (#) character.
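As an illustration of commenting out a setting, the sketch below prefixes a matching line with the pound sign. `comment_out_flag` is a hypothetical helper, not a HEAVY.AI tool, and the demonstration runs against a throwaway file rather than a live heavy.conf. Note that it matches by flag name only, so a name that appears in more than one section (such as port) would be commented out everywhere.

```shell
#!/bin/sh
# comment_out_flag FLAG FILE
# Hypothetical helper: prefix every uncommented "FLAG = value" line with '#'.
comment_out_flag() {
    flag=$1
    file=$2
    tmp="$file.tmp.$$"
    # Match the flag name at the start of a line, optional spaces, then '='.
    # '&' in the replacement re-inserts the matched text after the '#'.
    sed "s/^${flag}[[:space:]]*=/#&/" "$file" > "$tmp" && mv "$tmp" "$file"
}

# Demonstration on a temporary file.
demo=$(mktemp)
printf '%s\n' 'port = 6274' 'null-div-by-zero = true' > "$demo"
comment_out_flag null-div-by-zero "$demo"
cat "$demo"
rm -f "$demo"
```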
For encrypted backend connections, if you do not use a configuration file to start the database, Calcite expects passwords to be supplied through the command line, and the Calcite passwords will be visible in the processes table. If a configuration file is supplied, the passwords must be supplied in the file. If they are not, Calcite fails.

Configuration Parameters for HeavyDB

Following are the parameters for runtime settings on HeavyDB. The parameter syntax provides both the implied value and the default value as appropriate. Optional arguments are in square brackets, while implied and default values are in parentheses.
For example, consider allow-loop-joins [=arg(=1)] (=0).
  • If you do not use this flag, loop joins are not allowed by default.
  • If you provide no arguments, the implied value is 1 (true) (allow-loop-joins).
  • If you provide the argument 0, that is the same as the default (allow-loop-joins=0).
  • If you provide the argument 1, that is the same as the implied value (allow-loop-joins=1).
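The four cases above can be sketched as a small function. This is only an illustration of the implied-versus-default rule, not how HeavyDB parses its options; `resolve_flag` is a hypothetical name.

```shell
#!/bin/sh
# resolve_flag DEFAULT IMPLIED [TOKEN]
# TOKEN is how the flag appeared at startup:
#   (omitted)     -> flag not used, so the default value applies
#   "name"        -> flag given with no argument, so the implied value applies
#   "name=value"  -> an explicit argument is used as given
resolve_flag() {
    default=$1
    implied=$2
    if [ "$#" -lt 3 ]; then
        printf '%s\n' "$default"
        return
    fi
    token=$3
    case "$token" in
        *=*) printf '%s\n' "${token#*=}" ;;   # keep the text after the first '='
        *)   printf '%s\n' "$implied" ;;
    esac
}

# allow-loop-joins [=arg(=1)] (=0): implied value 1, default value 0
resolve_flag 0 1                        # flag absent
resolve_flag 0 1 "allow-loop-joins"     # flag with no argument
resolve_flag 0 1 "allow-loop-joins=0"   # explicit 0, same as the default
resolve_flag 0 1 "allow-loop-joins=1"   # explicit 1, same as the implied value
```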
Flag
Description
Default Value
allow-cpu-retry [=arg]
Allow the queries that failed on GPU to retry on CPU, even when watchdog is enabled. When watchdog is enabled, most queries that run on GPU and throw a watchdog exception fail. Turn this on to allow queries that fail the watchdog on GPU to retry on CPU. The default behavior is for queries that run out of memory on GPU to throw an error if watchdog is enabled. Watchdog is enabled by default.
TRUE[1]
allow-local-auth-fallback [=arg(=1)] (=0)
If SAML or LDAP logins are enabled, and the logins fail, this setting enables authentication based on internally stored login credentials. Command-line tools or other tools that do not support SAML might reject those users from logging in unless this feature is enabled. This allows a user to log in using credentials on the local database.
FALSE[0]
allow-loop-joins [=arg(=1)] (=0)
Enables all join queries to fall back to the loop join implementation. During a loop join, queries loop over all rows from all tables involved in the join, and evaluate the join condition. By default, loop joins are only allowed if the number of rows in the inner table is fewer than the trivial-loop-join-threshold, since loop joins are computationally expensive and run for an extended period. Modifying the trivial-loop-join-threshold is a safer alternative to globally enabling loop joins. You might choose to globally enable loop joins when you have many small tables for which loop join performance has been determined to be acceptable but modifying the trivial-loop-join-threshold would be tedious.
FALSE[0]
allowed-export-paths = ["root_path_1", "root_path_2", ...]
Specify a list of allowed root paths that can be used in export operations, such as the COPY TO command. Helps prevent exploitation of security vulnerabilities and prevent server crashes, data breaches, and full remote control of the host machine.
For example:
allowed-export-paths = ["/heavyai-storage/data/heavyai_export", "/home/centos"]
The list of paths must be on the same line as the configuration parameter.
Allowed file paths are enforced by default. The default export path (<data directory>/heavyai_export) is allowed by default, and all child paths of that path are allowed.
When using commands with other paths, the provided paths must be under an allowed root path. If you try to use a nonallowed path in a COPY TO command, an error response is returned.
N/A
allowed-import-paths = ["root_path_1", "root_path_2", ...]
Specify a list of allowed root paths that can be used in import operations, such as the COPY FROM command. Helps prevent exploitation of security vulnerabilities and prevent server crashes, data breaches, and full remote control of the host machine.
For example:
allowed-import-paths = ["/heavyai-storage/data/heavyai_import", "/home/centos"]
The list of paths must be on the same line as the configuration parameter.
Allowed file paths are enforced by default. The default import path (<data directory>/heavyai_import) is allowed by default, and all child paths of that allowed path are allowed.
When using commands with other paths, the provided paths must be under an allowed root path. If you try to use a nonallowed path in a COPY FROM command, an error response is returned.
N/A
approx_quantile_buffer arg
Size of a temporary buffer that is used to copy in the data for the APPROX_MEDIAN calculation. When full, the buffer is sorted before being merged into the internal distribution buffer configured in approx_quantile_centroids.
1000
approx_quantile_centroids arg
Size of the internal buffer used to approximate the distribution of the data for which the APPROX_MEDIAN calculation is taken. The larger the value, the greater the accuracy of the answer.
300
auth-cookie-name arg
Configure the authentication cookie name. If not explicitly set, the default name is oat.
oat
bigint-count [=arg]
Use 64-bit count. Disabled by default because 64-bit integer atomics are slow on GPUs. Enable this setting if you see negative values for a count, indicating overflow. In addition, if your data set has more than 4 billion records, you likely need to enable this setting.
FALSE[0]
bitmap-memory-limit arg
Set the maximum amount of memory (in GB) allocated for APPROX_COUNT_DISTINCT bitmaps per execution kernel (thread or GPU).
8
calcite-max-mem arg
Maximum memory available to the Calcite JVM. Change if Calcite reports out-of-memory errors.
1024
calcite-port arg
Calcite port number. Change to avoid collisions with ports already in use.
6279
calcite-service-timeout
Service timeout value, in milliseconds, for communications with Calcite. On databases with large numbers of tables, large numbers of concurrent queries, or many parallel updates and deletes, Calcite might respond more slowly. Increasing the timeout value can prevent THRIFT_EAGAIN timeout errors.
5000
columnar-large-projections [=arg]
Sets automatic use of columnar output, instead of row-wise output, for large projections.
TRUE
columnar-large-projections-threshold arg
Set the row-number threshold size for columnar output instead of row-wise output.
1000000
config arg
Path to heavy.conf. Change for testing and debugging.
$HEAVYAI_STORAGE/heavy.conf
cpu-only
Run in CPU-only mode. Set this flag to force HeavyDB to run in CPU mode, even when GPUs are available. Useful for debugging and on shared-tenancy systems where the current HeavyDB instance does not need to run on GPUs.
FALSE
cpu-buffer-mem-bytes arg
Size of memory reserved for CPU buffers [bytes]. Change to restrict the amount of CPU/system memory HeavyDB can consume. A default value of 0 indicates no limit on CPU memory use. (HEAVY.AI Server uses all available CPU memory on the system.)
0
cuda-block-size arg
Size of block to use on GPU. GPU performance tuning: Number of threads per block. Default of 0 means use all threads per block.
0
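The root-path rule described for allowed-export-paths and allowed-import-paths (a path is accepted when it is an allowed root or a child of one) can be sketched as follows. This is an illustration only, not HeavyDB's enforcement logic; `is_path_allowed` is a hypothetical helper, and it compares path strings without resolving symbolic links or ".." components, which a real check would need to do.

```shell
#!/bin/sh
# is_path_allowed PATH ROOT...
# Hypothetical helper: succeeds (exit 0) when PATH equals one of the ROOTs
# or lies beneath it. Purely textual; does not resolve ".." or symlinks.
is_path_allowed() {
    candidate=$1
    shift
    for root in "$@"; do
        root=${root%/}    # drop one trailing slash so "/a/" and "/a" compare the same
        case "$candidate" in
            "$root"|"$root"/*) return 0 ;;
        esac
    done
    return 1
}

# Roots taken from the allowed-export-paths example in the table above.
is_path_allowed /home/centos/out.csv \
    /heavyai-storage/data/heavyai_export /home/centos && echo "allowed"
is_path_allowed /etc/passwd \
    /heavyai-storage/data/heavyai_export /home/centos || echo "not allowed"
```

Requiring a slash after the root means a sibling directory such as /home/centos2 is not treated as a child of /home/centos.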

Additional Enterprise Edition Parameters

Flag
Description
Default Value
cluster arg
Path to data leaves list JSON file. Indicates that the HEAVY.AI server instance is an aggregator node, and where to find the rest of its cluster. Change for testing and debugging.
$HEAVYAI_STORAGE
compression-limit-bytes [=arg(=536870912)] (=536870912)
Compress result sets that are transferred between leaves. Minimum length of payload above which data is compressed.
536870912
compressor arg (=lz4hc)
Compressor algorithm to be used by the server to compress data being transferred between servers. See Data Compression for compression algorithm options.
lz4hc
ldap-dn arg
LDAP Distinguished Name.
ldap-role-query-regex arg
RegEx to use to extract role from role query result.
ldap-role-query-url arg
LDAP query role URL.
ldap-superuser-role arg
The role name to identify a superuser.
ldap-uri arg
LDAP server URI.
leaf-conn-timeout [=arg]
Leaf connect timeout, in milliseconds. Increase or decrease to fail Thrift connections between HeavyDB instances more or less quickly if a connection cannot be established.
20000
leaf-recv-timeout [=arg]
Leaf receive timeout, in milliseconds. Increase or decrease to fail Thrift connections between HeavyDB instances more or less quickly if data is not received in the time allotted.
300000
leaf-send-timeout [=arg]
Leaf send timeout, in milliseconds. Increase or decrease to fail Thrift connections between HeavyDB instances more or less quickly if data is not sent in the time allotted.
300000
saml-metadata-file arg
Path to identity provider metadata file.
Required for running SAML. An identity provider (like Okta) supplies a metadata file. From this file, HEAVY.AI uses:
  1. Public key of the identity provider to verify that the SAML response comes from it and not from somewhere else.
  2. URL of the SSO login page used to obtain a SAML token.
saml-sp-target-url arg
URL of the service provider for which SAML assertions should be generated. Required for running SAML. Used to verify that a SAML token was issued for HEAVY.AI and not for some other service.
saml-sync-roles arg (=0)
Enable mapping of SAML groups to HEAVY.AI roles. The SAML Identity provider (for example, Okta) automatically creates users at login and assigns them roles they already have as groups in SAML.
saml-sync-roles [=0]
string-servers arg
Path to string servers list JSON file. Indicates that HeavyDB is running in distributed mode and is required to designate a leaf server when running in distributed mode.

Configuration Parameters for HEAVY.AI Web Server

Flag
Description
Default
allow-any-origin
Allows for a CORS exception to the same-origin policy. Required to be true if Immerse is hosted on a different domain or subdomain than the one hosting heavy_web_server and heavydb.
Allowing any origin is a less secure mode than what heavy_web_server requires by default.
--allow-any-origin = false
-b | backend-url string
URL to http-port on heavydb. Change to avoid collisions with other services.
http://localhost:6278
cert string
Certificate file for HTTPS. Change for testing and debugging.
cert.pem
-c | config string
Path to HEAVY.AI configuration file. Change for testing and debugging.
-d | data string
Path to HEAVY.AI data directory. Change for testing and debugging.
data
db-query-list <path-to-query-list-file>
Preload data to memory based on SQL queries stored in a list file. Automatically run queries that load the most frequently used data to enhance performance. See Pre-loading Data.
n/a
docs string
Path to documentation directory. Change if you move your documentation files to another directory.
docs
enable-browser-logs [=arg]
Enable access to current log files via web browser. Only superusers (while logged in) can access log files.
Log files are available at http[s]://host:port/logs/log_name.
The web server log files:
  • ACCESS - http[s]://host:port/logs/access
  • ALL - http[s]://host:port/logs/all
HeavyDB log files:
  • INFO - http[s]://host:port/logs/info
  • WARNING - http[s]://host:port/logs/warning
  • ERROR - http[s]://host:port/logs/
FALSE[0]
enable-cert-verification
TLS certificate verification is a security measure that can be disabled for the cases of TLS certificates not issued by a trusted certificate authority. If using a locally or unofficially generated TLS certificate to secure the connection between heavydb and heavy_web_server, this parameter must be set to false. heavy_web_server expects a trusted certificate authority by default.
--enable-cert-verification = true
enable-cross-domain [=arg]
Enable frontend cross-domain authentication. Cross-domain session cookies require the SameSite = None; Secure headers. Can only be used with HTTPS domains; requires enable-https to be true.
FALSE[0]
enable-https
Enable HTTPS support. Change to enable secure HTTP.
enable-https-redirect [=arg]
Enable a new port that heavy_web_server listens on for incoming HTTP requests. When a request is received, the server returns a redirect response to the HTTPS port and protocol, so that browsers are immediately and transparently redirected. Use to provide a HEAVY.AI front end that can run on both the HTTP protocol (http://my-heavyai-frontend.com) on default HTTP port 80, and on the primary HTTPS protocol (https://my-heavyai-frontend.com) on default HTTPS port 443, and have requests to the HTTP protocol automatically redirected to HTTPS. Without this, requests to HTTP fail. Assuming heavy_web_server can attach to ports below 1024, the configuration would be:
enable-https-redirect = TRUE
http-to-https-redirect-port = 80
FALSE[0]
-f | frontend string
Path to frontend directory. Change if you move the location of your frontend UI files.
frontend
http-to-https-redirect-port = arg
Configures the HTTP (incoming) port used by enable-https-redirect. The port option specifies the redirect port number. Use to provide a HEAVY.AI front end that can run on both the HTTP protocol (http://my-heavyai-frontend.com) on default HTTP port 80, and on the primary HTTPS protocol (https://my-heavyai-frontend.com) on default HTTPS port 443, and have requests to the HTTP protocol automatically redirected to HTTPS. Without this, requests to HTTP fail. Assuming heavy_web_server can attach to ports below 1024, the configuration would be:
enable-https-redirect = TRUE
http-to-https-redirect-port = 80
6280
jwt-key-file
Path to a key file for client session encryption.
The file is expected to be a PEM-formatted (.pem) certificate file containing the unencrypted private key in PKCS #1, PKCS #8, or ASN.1 DER form.
Example PEM file creation using OpenSSL.
Required only if using a high-availability server configuration or another server configuration that requires an instance of Immerse to talk to multiple heavy_web_server instances.
Each heavy_web_server instance needs to use the same encryption key to encrypt and decrypt client session information which is used for session persistence ("sessionization") in Immerse.
key string
Key file for HTTPS. Change for testing and debugging.
key.pem
max-tls-version
Refers to the version of TLS encryption used to secure web protocol connections. Specifies a maximum TLS version.
min-tls-version
Refers to the version of TLS encryption used to secure web protocol connections. Specifies a minimum TLS version.
--min-tls-version = VersionTLS12
-p | port int
Frontend server port. Change to avoid collisions with other services.
6273