Use INSERT for both single- and multi-row ad hoc inserts. (When inserting many rows, use the more efficient COPY command.)
You can also insert into a table from a SELECT statement (INSERT INTO ... SELECT), as shown in the following example:
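A minimal INSERT-from-SELECT sketch; the table and column names (destination_tbl, source_tbl, id, name) are placeholders rather than examples from this documentation:

```sql
-- Copy selected rows from one table into another (hypothetical tables and columns)
INSERT INTO destination_tbl (id, name)
SELECT id, name
FROM source_tbl
WHERE id > 100;
```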
You can insert array literals into array columns. The inserts in the following example each have three array values, and demonstrate how you can:
- Create a table with variable-length and fixed-length array columns.
- Insert NULL arrays into these columns.
- Specify and insert array literals using {...} or ARRAY[...] syntax.
- Insert empty variable-length arrays using {} and ARRAY[] syntax.
- Insert array values that contain NULL elements.
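A sketch of such inserts, assuming hypothetical column names and types (each INSERT supplies three array values):

```sql
-- Hypothetical table: two variable-length array columns and one fixed-length array column
CREATE TABLE array_tbl (tags TEXT[], vals INTEGER[], trio INTEGER[3]);

-- NULL arrays
INSERT INTO array_tbl VALUES (NULL, NULL, NULL);

-- Array literals in {...} and ARRAY[...] syntax
INSERT INTO array_tbl VALUES ({'a', 'b'}, ARRAY[1, 2, 3], {4, 5, 6});

-- Empty variable-length arrays and a fixed-length array containing a NULL element
INSERT INTO array_tbl VALUES ({}, ARRAY[], {7, NULL, 9});
```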
If you create a table with a column that has a default value, or alter a table to add a column with a default value, an INSERT that omits that column creates a record containing the default value. For example, assume a table created as follows:
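A sketch of such a table, using the column names referenced below; the column types are assumptions:

```sql
-- name falls back to 'John Doe' when omitted from an INSERT
CREATE TABLE tbl (
  id INTEGER,
  name TEXT DEFAULT 'John Doe',
  age SMALLINT
);
```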
If you omit the name column from an INSERT or INSERT FROM SELECT statement, the missing value for column name is set to 'John Doe'.

INSERT INTO tbl (id, age) VALUES (1, 36); creates the record 1|'John Doe'|36.

INSERT INTO tbl (id, age) SELECT id, age FROM old_tbl; also sets all the name values to 'John Doe'.
Deletes rows that satisfy the WHERE clause from the specified table. If the WHERE clause is absent, all rows in the table are deleted, resulting in a valid but empty table.
In Release 6.4 and higher, you can run DELETE queries across tables in different databases on the same HEAVY.AI cluster without having to first connect to those databases.
To execute queries against another database, you must have ACCESS privilege on that database, as well as DELETE privilege.
Delete rows from a table in the my_other_db database:
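A minimal sketch; the table and column names in my_other_db are hypothetical:

```sql
-- Cross-database DELETE using database.table qualification
DELETE FROM my_other_db.sales WHERE region = 'EMEA';
```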
Interrupt a queued query. Specify the query by using its session ID.
To see the queries in the queue, use the SHOW QUERIES command:
To interrupt the last query in the list (ID 946-ooNP):
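A sketch of the interrupt command, using the session ID shown in the SHOW QUERIES output above:

```sql
-- Interrupt the queued query with session ID 946-ooNP
KILL QUERY '946-ooNP';
```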
Showing the queries again indicates that 946-ooNP has been deleted:
KILL QUERY is only available if the runtime query interrupt parameter (enable-runtime-query-interrupt) is set.
Interrupting a query in ‘PENDING_QUEUE’ status is supported in both distributed and single-server mode.
To enable query interrupt for tables imported from data files in local storage, set enable_non_kernel_time_query_interrupt to TRUE. (It is enabled by default.)
Shows generated Intermediate Representation (IR) code, identifying whether it is executed on GPU or CPU. This is primarily used internally by HEAVY.AI to monitor behavior.
For example, when you use the EXPLAIN command on a basic statement, the utility returns 90 lines of IR code that is not meant to be human readable. However, at the top of the listing, a heading indicates whether it is IR for the CPU or IR for the GPU, which can be useful to know in some situations.
Returns a relational algebra tree describing the high-level plan to execute the statement.
The table below lists the relational algebra classes used to describe the execution plan for a SQL statement.
For example, a SELECT statement is described as a table scan and projection. If you add a sort order, the table projection is folded under a LogicalSort procedure.
When the SQL statement is simple, the EXPLAIN CALCITE version is actually less “human readable.” EXPLAIN CALCITE is more useful when you work with more complex SQL statements, like the one that follows. This query performs a scan on the BOOK table before scanning the BOOK_ORDER table.
Revising the original SQL command results in a more natural selection order and a more performant query.
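A sketch of the idea on the BOOK and BOOK_ORDER tables; the join columns are assumptions:

```sql
-- A query in which BOOK is scanned before BOOK_ORDER
EXPLAIN CALCITE
SELECT *
FROM book, book_order
WHERE book.id = book_order.book_id;

-- Reordering the FROM clause changes the scan order reported in the plan
EXPLAIN CALCITE
SELECT *
FROM book_order, book
WHERE book_order.book_id = book.id;
```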
Augments the EXPLAIN CALCITE command by adding details about referenced columns in the query plan.
For example, for the following EXPLAIN CALCITE command execution:
EXPLAIN CALCITE DETAILED adds more column details as seen below:
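A sketch of both commands on a hypothetical table; the DETAILED variant additionally annotates the plan with the referenced column names:

```sql
-- Standard Calcite plan
EXPLAIN CALCITE SELECT city, pop FROM cities WHERE pop > 100000;

-- Same query with per-column detail added to the plan
EXPLAIN CALCITE DETAILED SELECT city, pop FROM cities WHERE pop > 100000;
```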
Change a parameter value for the current session.
Switch to another database without needing to log in again.
Your session will silently switch to the requested database.
The database exists, but the user does not have access to it:
The database does not exist:
Force the session to run the subsequent SQL commands in CPU mode:
Switch the session back to GPU mode:
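For example, using the EXECUTOR_DEVICE session parameter described in the parameter table below:

```sql
-- Run subsequent statements in CPU mode
ALTER SESSION SET EXECUTOR_DEVICE='CPU';

-- Switch back to GPU mode
ALTER SESSION SET EXECUTOR_DEVICE='GPU';
```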
Geospatial and array column projections are not supported in the COALESCE function and CASE expressions.
You can use a subquery anywhere an expression can be used, subject to any runtime constraints of that expression. For example, a subquery in a CASE statement must return exactly one row, but a subquery can return multiple values to an IN expression.
You can use a subquery anywhere a table is allowed (for example, FROM subquery), using aliases to name any reference to the table and columns returned by the subquery.
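Two sketches, one subquery used as an expression and one used as a table; the tables and columns (employees, departments, and so on) are hypothetical:

```sql
-- Scalar subquery inside a CASE expression (must return exactly one row)
SELECT name,
       CASE WHEN dept_id = (SELECT id FROM departments WHERE name = 'Sales')
            THEN 'sales'
            ELSE 'other'
       END AS dept_group
FROM employees;

-- Subquery in the FROM clause, referenced through the alias t
SELECT t.dept_id, t.emp_count
FROM (SELECT dept_id, COUNT(*) AS emp_count
      FROM employees
      GROUP BY dept_id) t
WHERE t.emp_count > 10;
```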
HEAVY.AI supports arrays in dictionary-encoded text and number fields (TINYINT, SMALLINT, INTEGER, BIGINT, FLOAT, and DOUBLE). Data stored in arrays are not normalized. For example, {green,yellow} is not the same as {yellow,green}. As with many SQL-based services, OmniSci array indexes are 1-based.
HEAVY.AI supports NULL variable-length arrays for all integer and floating-point data types, including dictionary-encoded string arrays. For example, you can insert NULL into BIGINT[ ], DOUBLE[ ], or TEXT[ ] columns. HEAVY.AI supports NULL fixed-length arrays for all integer and floating-point data types, but not for dictionary-encoded string arrays. For example, you can insert NULL into BIGINT[2] or DOUBLE[3] columns, but not into TEXT[2] columns.
The following examples show query results based on the table test_array created with the following statement:
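The original definition is not shown here; a plausible stand-in with an INTEGER array column is:

```sql
-- Hypothetical stand-in for test_array
CREATE TABLE test_array (
  id INTEGER,
  ints INTEGER[]
);

INSERT INTO test_array VALUES (1, {10, 20, 30});
INSERT INTO test_array VALUES (2, {40, NULL, 60});
```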
The following queries use arrays in an INTEGER field:
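A few example queries against the sketch above (array indexes are 1-based):

```sql
-- Access a specific element
SELECT id, ints[1] FROM test_array;

-- Filter on an array element
SELECT id FROM test_array WHERE ints[2] > 15;

-- Expand array elements into rows
SELECT UNNEST(ints) AS val, COUNT(*) AS n
FROM test_array
GROUP BY val;
```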
The SELECT command returns a set of records from one or more tables.
Sort order defaults to ascending (ASC).
Sorts null values after non-null values by default in an ascending sort, before non-null values in a descending sort. For any query, you can use NULLS FIRST to sort null values to the top of the results or NULLS LAST to sort null values to the bottom of the results.
Allows you to use a positional reference to choose the sort column. For example, the command SELECT colA, colB FROM table1 ORDER BY 2 sorts the results on colB because it is in position 2.
HEAVY.AI provides various query hints for controlling the behavior of the query execution engine.
SELECT hints must appear immediately after the SELECT keyword; otherwise, the query fails.
By default, a hint is applied to the query step in which it is defined. If you have multiple SELECT clauses and define a query hint in one of those clauses, the hint is applied only to the specific query step; the rest of the query steps are unaffected. For example, applying the /*+ cpu_mode */ hint affects only the SELECT clause in which it exists.
You can define a hint to apply to all query steps by prepending g_ to the query hint. For example, if you define /*+ g_cpu_mode */, CPU execution is applied to all query steps.
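For example, a step-local hint versus its global form; the table and columns are hypothetical:

```sql
-- Hint applies only to this SELECT step
SELECT /*+ cpu_mode */ COUNT(*) FROM flights;

-- The g_ prefix applies the hint to every step of the query
SELECT /*+ g_cpu_mode */ origin, COUNT(*)
FROM flights
GROUP BY origin;
```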
HEAVY.AI supports the following query hints.
The marker hint type represents a Boolean flag.
The key-value pair type is a hint name and its value.
In Release 6.4 and higher, you can run SELECT queries across tables in different databases on the same HEAVY.AI cluster without having to first connect to those databases. This enables more efficient storage and memory utilization by eliminating the need for table duplication across databases, and simplifies access to shared data and tables.
To execute queries against another database, you must have ACCESS privilege on that database, as well as SELECT privilege.
Execute a join query involving a table in the current database and another table in the my_other_db database:
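A sketch of such a join; the tables and columns are hypothetical:

```sql
-- Join a local table with a table in my_other_db
SELECT c.name, SUM(o.amount) AS total
FROM orders o
JOIN my_other_db.customers c ON o.customer_id = c.id
GROUP BY c.name;
```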
Usage Notes
SQL normally assumes that terms in the WHERE clause that cannot be used by indices are usually true. If this assumption is incorrect, it could lead to a suboptimal query plan. Use the LIKELY(X) and UNLIKELY(X) SQL functions to provide hints to the query planner about clause terms that are probably not true, which helps the query planner to select the best possible plan.
Use LIKELY/UNLIKELY to optimize evaluation of OR/AND logical expressions. LIKELY/UNLIKELY causes the left side of an expression to be evaluated first. This allows the right side of the query to be skipped when possible. For example, in the clause UNLIKELY(A) AND B, if A evaluates to FALSE, B does not need to be evaluated.
Consider the following:
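A sketch consistent with the description that follows; the table name and the columns x and y are hypothetical:

```sql
SELECT COUNT(*)
FROM test
WHERE UNLIKELY(x IN (7, 8, 9, 10)) AND y > 42;
```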
If x is one of the values 7, 8, 9, or 10, the filter y > 42 is applied. If x is not one of those values, the filter y > 42 is not applied.
Changes the values of the specified columns based on the assign argument (identifier=expression) in all rows that satisfy the condition in the WHERE clause.
Currently, HEAVY.AI does not support updating a geo column type (POINT, MULTIPOINT, LINESTRING, MULTILINESTRING, POLYGON, or MULTIPOLYGON) in a table.
You can update a table via subquery, which allows you to update based on calculations performed on another table.
Examples
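A sketch of an update via subquery; the tables and columns are hypothetical:

```sql
-- Update one table from an aggregate computed on another table
UPDATE customers
SET total_spend = (
  SELECT SUM(amount)
  FROM orders
  WHERE orders.customer_id = customers.id
);
```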
In Release 6.4 and higher, you can run UPDATE queries across tables in different databases on the same HEAVY.AI cluster without having to first connect to those databases.
To execute queries against another database, you must have ACCESS privilege on that database, as well as UPDATE privilege.
Update a row in a table in the my_other_db database:
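A sketch; the table and columns in my_other_db are hypothetical:

```sql
-- Cross-database UPDATE using database.table qualification
UPDATE my_other_db.customers
SET status = 'inactive'
WHERE id = 42;
```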
Method | Description
---|---
LogicalAggregate | Operator that eliminates duplicates and computes totals.
LogicalCalc | Expression that computes project expressions and also filters.
LogicalChi | Operator that converts a stream to a relation.
LogicalCorrelate | Operator that performs nested-loop joins.
LogicalDelta | Operator that converts a relation to a stream.
LogicalExchange | Expression that imposes a particular distribution on its input without otherwise changing its content.
LogicalFilter | Expression that iterates over its input and returns elements for which a condition evaluates to true.
LogicalIntersect | Expression that returns the intersection of the rows of its inputs.
LogicalJoin | Expression that combines two relational expressions according to some condition.
LogicalMatch | Expression that represents a MATCH_RECOGNIZE node.
LogicalMinus | Expression that returns the rows of its first input minus any matching rows from its other inputs. Corresponds to the SQL EXCEPT operator.
LogicalProject | Expression that computes a set of ‘select expressions’ from its input relational expression.
LogicalSort | Expression that imposes a particular sort order on its input without otherwise changing its content.
LogicalTableFunctionScan | Expression that calls a table-valued function.
LogicalTableModify | Expression that modifies a table. Similar to TableScan, but represents a request to modify a table instead of read from it.
LogicalTableScan | Reads all the rows from a RelOptTable.
LogicalUnion | Expression that returns the union of the rows of its inputs, optionally eliminating duplicates.
LogicalValues | Expression for which the value is a sequence of zero or more literal row values.
LogicalWindow | Expression representing a set of window aggregates. See Window Functions.
Parameter name | Values
---|---
EXECUTOR_DEVICE | CPU - Set the session to CPU execution mode: ALTER SESSION SET EXECUTOR_DEVICE='CPU'; GPU - Set the session to GPU execution mode: ALTER SESSION SET EXECUTOR_DEVICE='GPU'. NOTE: These parameter values have the same effect as the \cpu and \gpu commands in heavysql, but can be used with any tool capable of running SQL commands.
CURRENT_DATABASE | Can be set to any string value. If the value is a valid database name and the current user has access to it, the session switches to the new database. If the user does not have access or the database does not exist, an error is returned and the session falls back to the starting database.
Operator | Description
---|---
AND | Logical AND
NOT | Negates value
OR | Logical OR
Expression | Description
---|---
CASE WHEN condition THEN value [WHEN condition2 THEN value2 ...] ELSE value END | Case operator
COALESCE(val1, val2, ...) | Returns the first non-null value in the list
Expression | Description
---|---
expr IN (subquery or list of values) | Evaluates whether expr equals any value of the IN list.
expr NOT IN (subquery or list of values) | Evaluates whether expr does not equal any value of the IN list.
Hint | Details
---|---
allow_loop_join | Enable loop joins.
cpu_mode | Force CPU execution mode.
columnar_output | Enable columnar output for the input query.
disable_loop_join | Disable loop joins.
dynamic_watchdog | Enable dynamic watchdog.
dynamic_watchdog_off | Disable dynamic watchdog.
force_baseline_hash_join | Use the baseline hash join scheme by skipping the perfect hash join scheme, which is used by default.
force_one_to_many_hash_join | Deploy a one-to-many hash join by skipping one-to-one hash join, which is used by default.
keep_result | Add result set of the input query to the result set cache.
keep_table_function_result | Add result set of the table function query to the result set cache.
overlaps_allow_gpu_build | Use GPU (if available) to build an overlaps join hash table. (CPU is used by default.)
overlaps_no_cache | Skip adding an overlaps join hash table to the hash table cache.
rowwise_output | Enable row-wise output for the input query.
watchdog | Enable watchdog.
watchdog_off | Disable watchdog.
Hint | Details | Example
---|---|---
aggregate_tree_fanout | Defines the fan-out of the tree used to compute window aggregation over a frame. Depending on the frame size, the tree fan-out affects the performance of aggregation and of tree construction for each window function with a frame clause. |
loop_join_inner_table_max_num_rows | Set the maximum number of rows available for a loop join. | Set the maximum number of rows to 100.
max_join_hash_table_size | Set the maximum size of the hash table. | Set the maximum size of the join hash table to 100.
overlaps_bucket_threshold | Set the overlaps bucket threshold. | Set the overlaps threshold to 10.
overlaps_max_size | Set the maximum overlaps size. | Set the maximum overlap to 10.
overlaps_keys_per_bin | Set the number of overlaps keys per bin. |
query_time_limit | Set the maximum time for the query to run. |
Expression | Description
---|---
LIKELY(X) | Provides a hint to the query planner that argument X is likely to be true.
UNLIKELY(X) | Provides a hint to the query planner that argument X is unlikely to be true.
If a join column name or alias is not unique, it must be prefixed by its table name.
You can use BIGINT, INTEGER, SMALLINT, TINYINT, DATE, TIME, TIMESTAMP, or TEXT ENCODING DICT data types. TEXT ENCODING DICT is the most efficient because corresponding dictionary IDs are sequential and span a smaller range than, for example, the 65,535 values supported in a SMALLINT field. Depending on the number of values in your field, you can use TEXT ENCODING DICT(32) (up to approximately 2,150,000,000 distinct values), TEXT ENCODING DICT(16) (up to 64,000 distinct values), or TEXT ENCODING DICT(8) (up to 255 distinct values). For more information, see Data Types and Fixed Encoding.
When possible, joins involving a geospatial operator (such as ST_Contains) build a binned spatial hash table (overlaps hash join), falling back to a Cartesian loop join if a spatial hash join cannot be constructed.
The enable-overlaps-hashjoin flag controls whether the system attempts to use the overlaps spatial join strategy (true by default). If enable-overlaps-hashjoin is set to false, or if the system cannot build an overlaps hash join table for a geospatial join operator, the system attempts to fall back to a loop join. Loop joins can be performant in situations where one or both join tables have a small number of rows. When both tables grow large, loop join performance decreases.
Two flags control whether or not the system allows loop joins for a query (geospatial or not): allow-loop-joins and trivial-loop-join-threshold. By default, allow-loop-joins is set to false and trivial-loop-join-threshold to 1,000 (rows). If allow-loop-joins is set to true, the system allows any query with a loop join, regardless of table cardinalities (measured in number of rows). If left to the implicit default of false or set explicitly to false, the system allows loop join queries as long as the inner table (right-side table) has fewer rows than the threshold specified by trivial-loop-join-threshold.
For optimal performance, the system should utilize overlaps hash joins whenever possible. Use the following guidelines to maximize the use of the overlaps hash join framework and minimize fallback to loop joins when conducting geospatial joins:
The inner (right-side) table should always be the more complicated primitive. For example, for ST_Contains(polygon, point), the point table should be the outer (left) table and the polygon table should be the inner (right) table (see the sketch after these guidelines).
Currently, ST_CONTAINS and ST_INTERSECTS joins between point and polygon/multipolygon tables, and ST_DISTANCE < {distance} between two point tables, are supported for accelerated overlaps hash join queries.
For pointwise-distance joins, only the pattern WHERE ST_DISTANCE(table_a.point_col, table_b.point_col) < distance_in_degrees supports overlaps hash joins. Patterns like the following fall back to loop joins:
WHERE ST_DWITHIN(table_a.point_col, table_b.point_col, distance_in_degrees)
WHERE ST_DISTANCE(ST_TRANSFORM(table_a.point_col, 900913), ST_TRANSFORM(table_b.point_col, 900913)) < 100
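A sketch of the first guideline above (point table on the left, polygon table on the right); the table and column names are hypothetical:

```sql
-- Points (outer/left) joined to polygons (inner/right) with ST_Contains
SELECT counties.name, COUNT(*) AS sensor_count
FROM sensors
JOIN counties ON ST_Contains(counties.geom, sensors.location)
GROUP BY counties.name;
```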
You can create joins in a distributed environment in two ways:
Replicate small dimension tables that are used in the join.
Create a shard key on the column used in the join (note that there is a limit of one shard key per table). If the column involved in the join is a TEXT ENCODED field, you must create a SHARED DICTIONARY that references the FACT table key you are using to make the join.
The join order for one small table and one large table matters. If you swap the sales and customer tables on the join, it throws an exception stating that table "sales" must be replicated.
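Sketches of the two approaches using the sales/customer example; column names, encodings, and shard count are assumptions:

```sql
-- Option 1: replicate the small dimension table to every leaf node
CREATE TABLE customer (
  id INTEGER,
  name TEXT ENCODING DICT(32))
WITH (partitions = 'REPLICATED');

-- Option 2: shard the fact table on the join column
CREATE TABLE sales (
  customer_id INTEGER,
  amount DOUBLE,
  SHARD KEY (customer_id))
WITH (shard_count = 4);
```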
Expression | Description
---|---
array_col[n] | Returns value(s) from a specific location in the array.
UNNEST(array_col) | Extract the values in the array to a set of rows. Requires a GROUP BY.
 | Returns the number of elements in an array.
Use SHOW commands to get information about databases, tables, and user sessions.
Shows the CREATE SERVER statement that could have been used to create the server.
Shows the CREATE TABLE statement that could have been used to create the table.
Retrieve the databases accessible for the current user, showing the database name and owner.
Show registered compile-time UDFs and extension functions in the system and their arguments.
Displays a list of all row-level security (RLS) policies that exist for a user or role; admin rights are required. If EFFECTIVE is used, the list also includes any policies that exist for all roles that apply to the requested user or role.
Returns a list of queued queries in the system; information includes session ID, status, query string, account login name, client address, database name, and device type (CPU or GPU).
Admin users can see and interrupt all queries; non-admin users can see and interrupt only their own queries.

NOTE: SHOW QUERIES is only available if the runtime query interrupt parameter (enable-runtime-query-interrupt) is set.
To interrupt a query in the queue, see KILL QUERY.
If included with a name, lists the roles granted directly to a user or role. SHOW EFFECTIVE ROLES with a name lists the roles directly granted to a user or role, and also lists the roles indirectly inherited through the directly granted roles.
If the user name or role name is omitted, then a regular user sees their own roles, and a superuser sees a list of all roles existing in the system.
Show user-defined runtime functions and table functions.
Show data connectors.
Displays storage-related information for a table, such as the table ID/name, number of data/metadata files used by the table, total size of data/metadata files, and table epoch values.
You can see table details for all tables that you have access to in the current database, or for only those tables you specify.
Show details for all tables you have access to:
Show details for table omnisci_states:
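For example (output omitted):

```sql
-- All tables you have access to
SHOW TABLE DETAILS;

-- A single table
SHOW TABLE DETAILS omnisci_states;
```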
The number of columns returned includes system columns. As a result, the number of columns in column_count can be up to two greater than the number of columns created by the user.
Displays the list of available system (built-in) table functions.
For more information, see System Table Functions.
Show detailed output information for the specified table function. Output details vary depending on the table function specified.
View SHOW output for the generate_series table function:
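A sketch of the command; the output fields are listed in the table further below:

```sql
SHOW TABLE FUNCTIONS DETAILS generate_series;
```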
Retrieve the servers accessible for the current user.
Retrieve the tables accessible for the current user.
Lists name, ID, and default database for all or specified users for the current database. If the command is issued by a superuser, login permission status is also shown. Only superusers see users who do not have permission to log in.
SHOW [ALL] USER DETAILS lists name, ID, superuser status, default database, and login permission status for all users across the HeavyDB instance. This variant of the command is available only to superusers. Regular users who run the SHOW ALL USER DETAILS command receive an error message.
Show all user details for all users:
Show all user details for specified users ue, ud, ua, and uf:
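Sketches of the two commands above (output omitted):

```sql
-- All users
SHOW ALL USER DETAILS;

-- Specified users
SHOW ALL USER DETAILS ue, ud, ua, uf;
```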
If a specified user is not found, the superuser sees an error message:
Show user details for specified users ue, ud, and uf:
Show user details for all users:
Running SHOW ALL USER DETAILS results in an error message:
Show user details for all users:
If a specified user is not found, the user sees an error message:
Show user details for user ua:
Retrieve all persisted user sessions, showing the session ID, user login name, client address, and database name. Admin or superuser privileges required.
Output Header | Output Details
---|---
name | generate_series
signature | (i64 series_start, i64 series_stop, i64 series_step); (i64 series_start, i64 series_stop) -> Column
input_names | series_start, series_stop, series_step; series_start, series_stop
input_types | i64
output_names | generate_series
output_types | Column i64
CPU | true
GPU | true
runtime | false
filter_table_transpose | false
Expression | Example | Description
---|---|---
CAST(expr AS type) | CAST(1.25 AS FLOAT) | Converts an expression to another data type. For conversions to a TEXT type, use TRY_CAST.
TRY_CAST(text_expr AS type) | TRY_CAST('1.25' AS FLOAT) | Converts text to a non-text type, returning NULL if the conversion cannot be performed.
ENCODE_TEXT(none_encoded_str) | ENCODE_TEXT(long_str) | Converts a none-encoded text type to a dictionary-encoded text type.
FROM / TO | TINYINT | SMALLINT | INTEGER | BIGINT | FLOAT | DOUBLE | DECIMAL | TEXT | BOOLEAN | DATE | TIME | TIMESTAMP
---|---|---|---|---|---|---|---|---|---|---|---|---
TINYINT | - | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No | No | n/a
SMALLINT | Yes | - | Yes | Yes | Yes | Yes | Yes | Yes | No | No | No | n/a
INTEGER | Yes | Yes | - | Yes | Yes | Yes | Yes | Yes | Yes | No | No | No
BIGINT | Yes | Yes | Yes | - | Yes | Yes | Yes | Yes | No | No | No | No
FLOAT | Yes | Yes | Yes | Yes | - | Yes | No | Yes | No | No | No | No
DOUBLE | Yes | Yes | Yes | Yes | Yes | - | No | Yes | No | No | No | n/a
DECIMAL | Yes | Yes | Yes | Yes | Yes | Yes | - | Yes | No | No | No | n/a
TEXT | Yes (Use TRY_CAST) | Yes (Use TRY_CAST) | Yes (Use TRY_CAST) | Yes (Use TRY_CAST) | Yes (Use TRY_CAST) | Yes (Use TRY_CAST) | Yes (Use TRY_CAST) | - | Yes (Use TRY_CAST) | Yes (Use TRY_CAST) | Yes (Use TRY_CAST) | Yes (Use TRY_CAST)
BOOLEAN | No | No | Yes | No | No | No | No | Yes | - | n/a | n/a | n/a
DATE | No | No | No | No | No | No | No | Yes | n/a | - | No | Yes
TIME | No | No | No | No | No | No | No | Yes | n/a | No | - | n/a
TIMESTAMP | No | No | No | No | No | No | No | Yes | n/a | Yes | No | -