HEAVY.AI Docs
v8.1.0

SELECT

The SELECT command returns a set of records from one or more tables.

query:
  |   WITH withItem [ , withItem ]* query
  |   {
          select
      }
      [ ORDER BY orderItem [, orderItem ]* ]
      [ LIMIT [ start, ] { count | ALL } ]
      [ OFFSET start { ROW | ROWS } ]

withItem:
      name
      [ '(' column [, column ]* ')' ]
      AS '(' query ')'

orderItem:
      expression [ ASC | DESC ] [ NULLS FIRST | NULLS LAST ]

select:
      SELECT [ DISTINCT ] [/*+ hints */]
          { * | projectItem [, projectItem ]* }    
      FROM tableExpression
      [ WHERE booleanExpression ]
      [ GROUP BY { groupItem [, groupItem ]* } ]
      [ HAVING booleanExpression ]
      [ WINDOW window_name AS ( window_definition ) [, ...] ]

projectItem:
      expression [ [ AS ] columnAlias ]
  |   tableAlias . *

tableExpression:
      tableReference [, tableReference ]*
  |   tableExpression [ ( LEFT ) [ OUTER ] ] JOIN tableExpression [ joinCondition ]

joinCondition:
      ON booleanExpression
  |   USING '(' column [, column ]* ')'

tableReference:
      tablePrimary
      [ [ AS ] alias ]

tablePrimary:
      [ catalogName . ] tableName
  |   '(' query ')'

groupItem:
      expression
  |   '(' expression [, expression ]* ')'
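
As an illustration of how these grammar pieces combine, the sketch below runs one query that uses a WITH item, GROUP BY, HAVING, ORDER BY, and LIMIT. It uses Python's built-in sqlite3 module purely as a stand-in engine so that the example runs without a HEAVY.AI server; the sales table and its columns are hypothetical, and HeavyDB may differ in details.

```python
import sqlite3

# In-memory stand-in database; not a HEAVY.AI connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customerid INTEGER, saleamt REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 10.0), (1, 15.0), (2, 5.0), (3, 40.0), (3, 2.0)],
)

# withItem "totals" feeds the outer select; ORDER BY and LIMIT apply last.
query = """
WITH totals (customerid, total) AS (
    SELECT customerid, SUM(saleamt)
    FROM sales
    GROUP BY customerid
    HAVING SUM(saleamt) > 6
)
SELECT customerid, total
FROM totals
ORDER BY total DESC
LIMIT 2
"""
top_customers = conn.execute(query).fetchall()
print(top_customers)
```

Customer 2 is dropped by the HAVING clause, and the remaining two rows come back ordered by their totals, largest first.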

ORDER BY

  • Sort order defaults to ascending (ASC).

  • Sorts null values after non-null values by default in an ascending sort, before non-null values in a descending sort. For any query, you can use NULLS FIRST to sort null values to the top of the results or NULLS LAST to sort null values to the bottom of the results.

  • Allows you to use a positional reference to choose the sort column. For example, the command SELECT colA,colB FROM table1 ORDER BY 2 sorts the results on colB because it is in position 2.
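
The null placement and positional reference rules above can be tried out with the following sketch. It assumes Python's built-in sqlite3 module (SQLite 3.30 or later) as a stand-in engine. SQLite's default null ordering is not the same as the default described above, so the example always spells out NULLS FIRST or NULLS LAST. The contents of table1 are hypothetical.

```python
import sqlite3

# In-memory stand-in database; not a HEAVY.AI connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (colA TEXT, colB INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("a", 3), ("b", None), ("c", 1)],
)

# Positional reference: 2 is the second projected column (colB).
nulls_last = conn.execute(
    "SELECT colA, colB FROM table1 ORDER BY 2 ASC NULLS LAST"
).fetchall()

# Same sort column by name, with nulls moved to the top of the results.
nulls_first = conn.execute(
    "SELECT colA, colB FROM table1 ORDER BY colB ASC NULLS FIRST"
).fetchall()

print(nulls_last)
print(nulls_first)
```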

Query Hints

HEAVY.AI provides various query hints for controlling the behavior of the query execution engine.

Syntax

SELECT /*+ hint */ ... FROM ...;

SELECT hints must appear first, immediately after the SELECT keyword; otherwise, the query fails.

By default, a hint is applied to the query step in which it is defined. If you have multiple SELECT clauses and define a query hint in one of those clauses, the hint is applied only to that query step; the other query steps are unaffected. For example, applying the /*+ cpu_mode */ hint affects only the SELECT clause in which it appears.

You can define a hint to apply to all query steps by prepending g_ to the query hint. For example, if you define /*+ g_cpu_mode */, CPU execution is applied to all query steps.
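
The placement and scoping rules can be sketched with a small string helper. This is only an illustration: add_hint is a hypothetical function, not part of any HEAVY.AI client library, and the table name t is made up. It inserts the hint comment directly after the first SELECT keyword and optionally prepends g_ so that the hint applies to all query steps.

```python
import re


def add_hint(sql: str, hint: str, all_steps: bool = False) -> str:
    """Return sql with /*+ hint */ placed right after the first SELECT keyword."""
    if all_steps:
        # The g_ prefix asks for the hint to apply to every query step.
        hint = "g_" + hint
    return re.sub(
        r"\bSELECT\b",
        lambda m: f"{m.group(0)} /*+ {hint} */",
        sql,
        count=1,  # only the outermost (first) SELECT is touched
        flags=re.IGNORECASE,
    )


step_only = add_hint("SELECT COUNT(*) FROM t", "cpu_mode")
every_step = add_hint("SELECT COUNT(*) FROM t", "cpu_mode", all_steps=True)
print(step_only)
print(every_step)
```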

HEAVY.AI supports the following query hints.

The marker hint type represents a Boolean flag.

Hint
Details
Example

allow_loop_join

Enable loop joins.

SELECT /*+ allow_loop_join */ ...

cpu_mode

Force CPU execution mode.

SELECT /*+ cpu_mode */ ...

columnar_output

Enable columnar output for the input query.

SELECT /*+ columnar_output */ ...

disable_loop_join

Disable loop joins.

SELECT /*+ disable_loop_join */ ...

dynamic_watchdog

Enable dynamic watchdog.

SELECT /*+ dynamic_watchdog */ ...

dynamic_watchdog_off

Disable dynamic watchdog.

SELECT /*+ dynamic_watchdog_off */ ...

force_baseline_hash_join

Use the baseline hash join scheme instead of the perfect hash join scheme, which is used by default.

SELECT /*+ force_baseline_hash_join */ ...

force_one_to_many_hash_join

Use a one-to-many hash join instead of the one-to-one hash join, which is used by default.

SELECT /*+ force_one_to_many_hash_join */ ...

keep_result

Add the result set of the input query to the result set cache.

SELECT /*+ keep_result */ ...

keep_table_function_result

Add the result set of the table function query to the result set cache.

SELECT /*+ keep_table_function_result */ ...

overlaps_allow_gpu_build

Use GPU (if available) to build an overlaps join hash table. (CPU is used by default.)

SELECT /*+ overlaps_allow_gpu_build */ ...

overlaps_no_cache

Skip adding an overlaps join hash table to the hash table cache.

SELECT /*+ overlaps_no_cache */ ...

rowwise_output

Enable row-wise output for the input query.

SELECT /*+ rowwise_output */ ...

watchdog

Enable watchdog.

SELECT /*+ watchdog */ ...

watchdog_off

Disable watchdog.

SELECT /*+ watchdog_off */ ...

The key-value pair type is a hint name and its value.

Hint
Details
Example

aggregate_tree_fanout

Defines the fanout of the tree used to compute window aggregation over a frame. Depending on the frame size, the tree fanout affects the performance of both aggregation and tree construction for each window function with a frame clause.

  • Value type: INT

  • Range: 0-1024

SELECT /*+ aggregate_tree_fanout(32) */ SUM(y) OVER (ORDER BY x ROWS BETWEEN ...) ...

loop_join_inner_table_max_num_rows

Set the maximum number of rows available for a loop join.

  • Value type: INT

  • Range: 0 < x

Set the maximum number of rows to 100:

SELECT /*+ loop_join_inner_table_max_num_rows(100) */ ...

max_join_hash_table_size

Set the maximum size of the hash table.

  • Value type: INT

  • Range: 0 < x

Set the maximum size of the join hash table to 100:

SELECT /*+ max_join_hash_table_size(100) */ ...

overlaps_bucket_threshold

Set the overlaps bucket threshold.

  • Value type: DOUBLE

  • Range: 0-90

Set the overlaps threshold to 10:

SELECT /*+ overlaps_bucket_threshold(10.0) */ ...

overlaps_max_size

Set the maximum overlaps size.

  • Value type: INTEGER

  • Range: >=0

Set the maximum overlaps size to 10:

SELECT /*+ overlaps_max_size(10) */ ...

overlaps_keys_per_bin

Set the number of overlaps keys per bin.

  • Value type: DOUBLE

  • Range: 0.0 < x < double::max

SELECT /*+ overlaps_keys_per_bin(0.1) */ ...

query_time_limit

Set the maximum time for the query to run.

  • Value type: INTEGER

  • Range: >=0

SELECT /*+ query_time_limit(1000) */ ...
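
As a rough sketch of how an application might assemble key-value hints, the hypothetical helper below formats a hint as name(value) and checks the value against the lower bound listed for that hint. key_value_hint and LOWER_BOUNDS are illustrative names, not HEAVY.AI APIs. Only hints with an unambiguous lower bound in the table are included, and the table name t is made up.

```python
# Lower bounds taken from the Range entries: (minimum, minimum_is_inclusive).
LOWER_BOUNDS = {
    "loop_join_inner_table_max_num_rows": (0, False),  # 0 < x
    "max_join_hash_table_size": (0, False),            # 0 < x
    "overlaps_max_size": (0, True),                    # >= 0
    "query_time_limit": (0, True),                     # >= 0
}


def key_value_hint(name: str, value: int) -> str:
    """Format a key-value hint comment, rejecting out-of-range values."""
    minimum, inclusive = LOWER_BOUNDS[name]
    in_range = value >= minimum if inclusive else value > minimum
    if not in_range:
        raise ValueError(f"{value} is out of range for {name}")
    return f"/*+ {name}({value}) */"


hint = key_value_hint("query_time_limit", 1000)
sql = f"SELECT {hint} COUNT(*) FROM t"
print(sql)
```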

Cross-Database Queries

In Release 6.4 and higher, you can run SELECT queries across tables in different databases on the same HEAVY.AI cluster without having to first connect to those databases. This enables more efficient storage and memory utilization by eliminating the need for table duplication across databases, and simplifies access to shared data and tables.

To execute queries against another database, you must have ACCESS privilege on that database, as well as SELECT privilege.

Example

Execute a join query involving a table in the current database and another table in the my_other_db database:

SELECT name, saleamt, saledate FROM my_other_db.customers AS c, sales AS s 
  WHERE c.id = s.customerid;
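
To see the shape of the query's result, the sketch below runs the same join with Python's built-in sqlite3 module as a stand-in engine. This is an assumption-laden illustration: SQLite needs an explicit ATTACH step to see a second database, whereas the text above says HEAVY.AI needs no separate connection. Privileges are not modeled, and the sample rows are made up.

```python
import sqlite3

# In-memory stand-in for the current database; not a HEAVY.AI connection.
conn = sqlite3.connect(":memory:")

# SQLite-specific step to make a second database visible under a name.
conn.execute("ATTACH DATABASE ':memory:' AS my_other_db")

conn.execute("CREATE TABLE my_other_db.customers (id INTEGER, name TEXT)")
conn.execute(
    "CREATE TABLE sales (customerid INTEGER, saleamt REAL, saledate TEXT)"
)
conn.execute("INSERT INTO my_other_db.customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO sales VALUES (1, 9.5, '2024-01-02')")
conn.execute("INSERT INTO sales VALUES (2, 3.0, '2024-01-03')")

# The database-qualified name selects the table in the other database.
rows = conn.execute(
    """
    SELECT name, saleamt, saledate
    FROM my_other_db.customers AS c, sales AS s
    WHERE c.id = s.customerid
    """
).fetchall()
print(rows)
```

Only the sale whose customerid matches a row in my_other_db.customers is returned.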
