Release 6.x

Archived Release Notes for v6.x

6.1.1 | 6.1.0 | 6.0.0

6.1.1 - July 26, 2022

HeavyDB - New Features and Improvements

  • Adds support for POLYGON to MULTIPOLYGON promotion in the load table Thrift APIs and SQLImporter.

HeavyDB - Fixed Issues

  • Fixes an issue that caused an intermittent KafkaImporter crash on CentOS 7.9.

  • Fixes an issue that caused incorrect results for multiple aggregations of date columns that include COUNT DISTINCT.

Heavy Immerse - New Features and Improvements

  • Adds support for limiting the number of charts in a dashboard through the ui/limit_charts_per_dashboard feature flag. The default value is 0 (no limit).

6.1.0 - July 7, 2022

HeavyDB - New Features

Administrative Controls and Feedback

  • Adds a new set of log-based system tables (request_logs, server_logs, web_server_logs, and web_server_access_logs).

  • Adds a new Request Logs and Monitoring dashboard.

  • Adds a new SHOW CREATE SERVER command, which displays the create server DDL for a specified foreign server.

  • Adds support for non-super-user execution of SHOW CREATE TABLE on views.

  • Adds a new ALTER SESSION SET EXECUTOR_DEVICE command, which updates the type of executor device (CPU or GPU) for the current session.

  • Adds a new ALTER SESSION SET CURRENT_DATABASE command, which updates the connected database for the current session.

  • Adds a new ALTER DATABASE OWNER TO command, which allows super users to change the owner of a database.

Core SQL Functionality Improvements

  • Extends the INSERT command to support inserting multiple rows in a single statement (batch insert).

  • Add support for default values on shard key columns.

  • Add initial support for Window function framing, including support for BETWEEN ROWS clause for all numeric and date/time types, and BETWEEN RANGE clause for numeric types.

  • Enable group-by push down for UNION ALL such that group-by and aggregate operations applied to the output of UNION ALL are evaluated on the UNION ALL inputs, improving performance.

  • Add support for LCASE (alias for LOWER), UCASE (alias for UPPER), LEFT, and RIGHT string functions.
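
To illustrate what a row-based window frame computes, the following Python sketch sums each row together with a fixed number of preceding (and optionally following) rows. It only illustrates frame semantics and is not HeavyDB code; the function name framed_sum and the sample data are hypothetical.

```python
def framed_sum(values, preceding, following=0):
    """For each row, sum the frame spanning `preceding` rows before
    through `following` rows after the current row (clamped at the edges)."""
    results = []
    n = len(values)
    for i in range(n):
        start = max(0, i - preceding)       # frame start, clamped to the first row
        end = min(n, i + following + 1)     # frame end (exclusive), clamped to the last row
        results.append(sum(values[start:end]))
    return results


# Hypothetical ordered partition: one value per row.
daily_sales = [10, 20, 30, 40, 50]

# Frame of 2 preceding rows through the current row.
print(framed_sum(daily_sales, preceding=2))  # [10, 30, 60, 90, 120]
```

Rows near the start of the partition have fewer preceding rows available, so their frames are shorter.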

Import / Export

  • Adds a new trim_spaces option for delimited file import.

  • (BETA) Adds data import/COPY FROM support from Relational Database Management Systems and Data Warehouses using the Open Database Connectivity (ODBC) interface.

Performance Enhancements

  • Initial support for CUDA streams to parallelize GPU computation and memory transfers.

  • Increase per-GPU projection limit with watchdog enabled from 32M to 128M rows to take advantage of improvements in large projection support in recent releases.

Extensibility Infrastructure

  • Add new SHOW FUNCTIONS and SHOW FUNCTIONS DETAILS commands to show registered compile-time UDFs and extension functions in the system and their arguments, and SHOW RUNTIME FUNCTIONS [DETAILS] and SHOW RUNTIME TABLE FUNCTIONS [DETAILS] to show user-defined runtime functions/table functions.

  • Support timestamp inputs and outputs for table functions.

Advanced Analytics

  • Add tf_compute_dwell_times table function that, given a query input with entity keys and timestamps, and parameters specifying the minimum session time, minimum number of session records, and maximum inactive seconds, outputs all unique sessions found in the data along with the duration of each session (dwell time).

  • Add tf_feature_self_similarity table function that, given a query input of entity keys/IDs, a set of feature columns, and a metric column, scores each pair of entities by the cosine similarity of their feature column(s), which can optionally be TF/IDF weighted.

  • Add tf_feature_similarity table function that, given a query input of entity keys, feature columns, and a metric column, and a second query input specifying a search vector of feature columns and metric, scores each entity in the first input against the search vector. The score is the cosine similarity between the feature column(s) of the entity and those of the search vector, which can optionally be TF/IDF weighted.
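
The cosine-similarity scoring described for the feature similarity functions can be sketched in plain Python. This is a conceptual illustration only, not the HeavyDB implementation: the row layout (entity key, feature, metric), the function names, and the sample data are assumptions, and TF/IDF weighting is left out.

```python
import math
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors stored as {feature: metric} dicts."""
    dot = sum(value * b.get(feature, 0.0) for feature, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen for this sketch
    return dot / (norm_a * norm_b)


def score_entities(rows, search_vector):
    """Score every entity against a search vector.

    rows: iterable of (entity_key, feature, metric) tuples (assumed layout).
    """
    vectors = defaultdict(dict)
    for entity, feature, metric in rows:
        # Accumulate the metric per (entity, feature) pair.
        vectors[entity][feature] = vectors[entity].get(feature, 0.0) + metric
    return {entity: cosine_similarity(vec, search_vector)
            for entity, vec in vectors.items()}


# Hypothetical data: metric values per entity and feature.
rows = [
    ("u1", "shoes", 1.0), ("u1", "hats", 2.0),
    ("u2", "shoes", 2.0), ("u2", "hats", 4.0),
    ("u3", "books", 5.0),
]
scores = score_entities(rows, {"shoes": 1.0, "hats": 2.0})
for entity, score in sorted(scores.items()):
    print(entity, round(score, 3))
```

Here u1 and u2 point in the same direction as the search vector, so they score close to 1, while u3 shares no features with it and scores 0.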

HeavyDB - Fixed Issues

  • Fixed an issue where some join queries on ODBC-backed foreign tables can return empty result sets for the first query.

  • Fixed an issue where append refreshes on foreign tables backed by delimited or regex-parsed files ignore file-path filter and sort options.

  • Fixed a crash that can occur when very large dates are specified for the refresh_start_date_time foreign table option.

  • Fixed a crash that can occur when a foreign table’s data source is updated within a refresh window.

  • Fixed an issue where databases owned by deleted user accounts are not visible, and adds a restriction that prevents dropping users who own databases.

  • Fixed an issue where joins on string dictionary-encoded columns would hit spurious none-encoded string translation.

  • Fixed an issue with certain UNION ALL query patterns, such as UNION ALL containing logical values.

  • Disabled KEY_FOR_STRING for UNNEST operations on string dictionary-encoded columns, to prevent a crash.

  • Fixed an issue where logged stats for raster imports could overflow.

  • Fixed an issue where joins on synthetic tables (for example, created with a VALUES statement or table function without an underlying table) could crash.

  • Fixed an issue where require checks used on string dictionary inputs to a table function could crash.

  • Fixed a crash and/or wrong query results that can occur when a decimal literal is used in a nested query.

HeavyRender - Fixed Issues

  • Fixed a potential crash when attempting to auto-retry a render immediately after an OutOfGpuMemory exception is thrown. This crash can occur only if the render-oom-retry-threshold configuration option is set.

  • Fixed a regression where polygons with transparent colors are rendered opaque.

  • Corrects an issue with point/symbol rendering with explicit Vega projections where the projection was not updated on pan or zoom if the query did not change.

  • Significant improvements in hit-testing consistency and stability when rendering queries with subqueries, window functions, or table functions.

Heavy Immerse - New Features

Formatting and Cartography

  • Font size controls.

  • Borders and Zebra Striping in Table charts.

  • Justify content in Table charts.

  • Customizable polygon border control.

  • Allow measure date formatting for table charts.

  • Extend y-axis on Vega combo charts to end at the next whole value past the highest data point.

User Experience Enhancements

  • Add layer visibility toggle to kebab dropdown on multi-layer raster charts.

  • Made unsaved changes modal less aggressive.

  • Custom Source Table Functions Browser.

  • Don’t show unsaved warning modal after adding default filter set.

Connectivity

  • (BETA) PostgreSQL connector.

  • Allows maxBounds to be set in servers.json.

Heavy Immerse - Fixed Issues

  • Toggle dashboard unsaved when updating annotations.

  • Dashboard save state behavior fixes.

  • Table Chart order by group keys when present.

  • Use key_for_string when ordering by known dictionary measures/dimensions.

  • Add default formatting for date/time on table chart.

  • Add admin feature flag to hide key manager.

  • Customizable polygon border color and existing border bug fixes.

  • Fixed an issue where appending to a table using the PostgreSQL connector failed.

  • Fixed a crash that occurred when building a raster chart with the layer visibility toggle feature flag enabled.

6.0.0 - April 11, 2022

HeavyDB - New Features and Improvements

  • Support for fast string functions on dictionary-encoded text columns (the default), including LOWER, UPPER, INITCAP, TRIM/LTRIM/RTRIM, LPAD/RPAD, REVERSE, REPEAT, SUBSTRING/SUBSTR, REPLACE, OVERLAY, SPLIT_PART, REGEXP_REPLACE, REGEXP_SUBSTR, and CONCAT (||). The output of these expressions can be chained, grouped-by, and used in both the left and right side of join predicates.

  • Support for fast string equality/inequality operations without the previous requirement of watchdog disablement when the two columns do not share dictionaries.

  • Support for fast case statements with multiple text column inputs that do not share dictionary-encoded strings.

  • Support for ENCODE_TEXT to encode none-encoded strings, which can then be grouped on and manipulated like dictionary-encoded strings. This operator is not intended for interactive use at scale but instead for ELT-like scenarios. Use the new server flag watchdog-none-encoded-string-translation-limit to set the upper cardinality allowed for such operations (1,000,000 by default).

  • Support for UNION ALL is enabled by default, and now works across string columns that do not share dictionaries with significantly better performance than in the previous release.

  • Window functions now support expressions in the PARTITION BY and ORDER BY clauses.

  • Support for subqueries in CASE statement clauses.

  • SHOW USER DETAILS is changed to only list those users with access to the currently-selected database. Previously, all users on the HeavyDB instance would be listed; this is still available to superusers with SHOW ALL USER DETAILS.

  • 10X improvements in initial join performance (including geo joins) through faster, parallelized hash table construction, removing redundant inter-thread hash table computation.

  • Improved join ordering to avoid loop joins in certain scenarios.

  • Parallel compilation of queries as well as inter-executor generated code increases concurrency and throughput in common, Immerse-driven scenarios by up to 20%. Also decreases latency for a single user interacting with dashboards or issuing SQL queries in a way that required new plans to be code-generated.

  • New result set recycler allows query substeps (which can be expensive in subqueries) to be cached using the SQL hint /*+ keep_result */, dramatically improving performance where the subquery is reused across multiple queries (for example, in Immerse) and only the outer steps of the query vary.

  • The default for the header option of COPY TO to a CSV/TSV file has been changed from 'false' to 'true'.

  • Faster dictionary map in StringDictionaryProxy, accelerating various string operations involving transient entries.

  • Arrow execution endpoints now use multiple executors and can run concurrently like queries issued to the Thrift endpoints.

  • Addition of sparse dictionary output capability for Arrow queries, which automatically creates a subset of a string dictionary to send via Arrow when it detects that it is faster than sending the full, unfiltered dictionary. This provides orders-of-magnitude better server- and client-side performance and scalability for common cases where large dictionary-encoded text columns are filtered or top-k sorted such that only a small subset of dictionary entries are needed in the result set.

  • ST_INTERSECTS can now operate directly on compressed coordinates (the default), leading to a 2-3X increase in speed.

  • New table function framework allows for both system and user-defined table functions. Table functions can run on both CPU and GPU and are designed for efficient, scalable execution of custom algorithms in-situ on data that might be hard or impossible to implement in SQL.

  • Support for generate_series table function (similar to Postgres) for easy and fast integer series generation, particularly useful for left joins against binned tables to fill in gaps, whether for visualization or downstream operations like window functions, and generate_random_strings for generation of string columns of a user-defined size and cardinality.

  • Support for geo_rasterize and geo_rasterize_slope table functions to efficiently bin vector data into gap-free bins, with the optional ability to fill in null values, apply box blur, and compute slope and aspect ratios.

  • Initial support for HeavyRF, a module that allows for real-time, ray-based computation of signal propagation, taking inputs of both terrain data and real or hypothetical signal sources.

  • Beta support for Python-defined scalar (row-level) and tabular User Defined Functions (UDFs and UDTFs), using the RBC library to translate Numba Python code into LLVM IR that is JITed into query execution code for fast, scalable, custom user-defined capabilities.

  • Complete redesign and rewrite of Parquet import to one that is more robust, efficient, and performant.

  • Adds support for import from regex parsed files on either the server file system or S3 using the COPY FROM command.

  • The geo and parquet WITH options of COPY FROM have been deprecated and replaced by source_type. Using the deprecated syntax generates the following warning: “Deprecation Warning: COPY FROM WITH (geo='true') is deprecated. Use WITH (source_type='geo_file') instead.” Update any scripts you have to replace the deprecated syntax with the new syntax. For more information, see CSV/TSV Import.

  • (BETA) Adds support for import from RDBMS/data warehouses using the COPY FROM command.

  • Adds system table support.

  • A new default information_schema database contains 10 new system tables that provide information regarding CPU/GPU memory utilization, storage space utilization, database objects, and database object permissions.

  • New system dashboards that enable intuitive visualization of system resource utilization and user roles and permissions.

  • Support for Zarr and NetCDF raster file import.

  • You can now import raster files that use ground control points for geospatial referencing.

  • Support for file path filtering, globbing, and sorting when importing geo and raster files.

  • Improved error messaging when attempting to save a dashboard that uses a duplicate dashboard name.
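
As a rough illustration of the gap-filling use described for generate_series in the list above, the Python sketch below builds a complete integer series and looks up each value in a set of binned counts, which is the effect of a left join from the series to a binned table. It is not HeavyDB code: the function name, the fill value, and the sample bins are hypothetical, and the inclusive end of the series is assumed to follow PostgreSQL's generate_series.

```python
def fill_gaps(binned_counts, start, stop, step=1, fill_value=0):
    """Return (bin, count) pairs for every bin in [start, stop],
    using fill_value where the binned data has no entry."""
    series = range(start, stop + 1, step)  # inclusive of stop (assumption)
    return [(bin_id, binned_counts.get(bin_id, fill_value)) for bin_id in series]


# Hypothetical binned query result with bins 1 and 3 missing.
counts_by_hour = {0: 5, 2: 7}

for hour, count in fill_gaps(counts_by_hour, start=0, stop=3):
    print(hour, count)
```

Every bin in the requested range appears in the output, so downstream steps such as charts or window functions see a continuous series.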

HeavyConnect

  • Support for connections to delimited files on either the server file system or S3. S3 support includes an option to use the S3 Select API, which provides better performance but with limitations on supported column types.

  • Support for connections to Parquet files on either the server file system or S3. HeavyConnect leverages Parquet metadata to provide efficient data access and row group-level filter push down.

  • Parquet column type coercion. Convert Parquet column types to more memory-efficient HeavyDB column types for use cases that guarantee no loss of information.

  • Connections to regex parsed files on either the server file system or S3. This enables you to query unstructured text files, such as logs, by specifying regular expression patterns that extract components of the text files into table columns.

  • (BETA) Support for connections to Relational Database Management Systems and Data Warehouses, leveraging the Open Database Connectivity (ODBC) interface to provide seamless access to data.

  • (BETA) ODBC column type coercion. Use HeavyConnect to convert ODBC column types to more memory-efficient HeavyDB column types for use cases that guarantee no loss of information.

  • Support for scheduled data refreshes. Specify a start date time and interval at which connected data gets refreshed.

  • Adds support for disk level caching. By default, data fetched by HeavyConnect are cached at the disk level in addition to normal CPU/GPU level caching. This provides better overall query performance for network based connections, such as S3, and systems with limited CPU/GPU memory capacity. Disk cache size and level can be set through HeavyDB server configuration.

  • Adds support for file path filtering, globbing, and sorting for Parquet, delimited, and regex parsed file use cases.

  • Complete redesign and rewrite of the Parquet detect_column_types Thrift API. The Parquet detect/data preview feature is now more robust, efficient, and performant.
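
The idea behind regex parsed files, where a regular expression extracts components of each text line into table columns, can be sketched with Python's re module. This only illustrates the concept and is not how HeavyConnect is implemented; the log format, the pattern, and the column names are hypothetical, and filling unmatched lines with None is a choice made for this sketch.

```python
import re

# Hypothetical log format: "<timestamp> [<level>] <message>".
# Each capture group corresponds to one table column.
LINE_PATTERN = re.compile(r"^(\S+) \[(\w+)\] (.*)$")
COLUMNS = ("log_time", "level", "message")


def parse_lines(lines):
    """Turn raw text lines into rows with one value per capture group."""
    rows = []
    for line in lines:
        match = LINE_PATTERN.match(line)
        # Lines that do not match yield a row of None values in this sketch.
        rows.append(match.groups() if match else (None,) * len(COLUMNS))
    return rows


log_lines = [
    "2022-04-11T10:00:00 [INFO] server started",
    "2022-04-11T10:00:05 [ERROR] disk cache unavailable",
    "not a log line",
]
for row in parse_lines(log_lines):
    print(dict(zip(COLUMNS, row)))
```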

HeavyDB - Fixed Issues

  • Change to query interrupt mechanism allowing certain classes of queries such as loop joins to be easily and quickly interrupted.

  • Fixed a crash that could occur with joins on predicates that had functions in the left-hand side expression, such as geoToH3.

  • Fixed a crash that could occur with Arrow queries that did not return results.

  • Avoid building metadata for empty result sets.

  • Fixes a crash that can occur when executing queries on GPU that involve a baseline group by and variable length column projections.

  • Fixes some table query concurrency bottlenecks. Previously, queries such as INSERT, TRUNCATE, and DROP TABLE required system wide locks to execute and would therefore block execution of other unrelated queries. These kinds of queries can now be executed concurrently.

  • Fixes a crash that can occur on server restart when the disk cache is enabled and tables with cached data are deleted.

  • Fixes a crash that can occur when the max_rows table option is altered for an empty table.

  • Fixes an issue in the JDBC driver where tables from multiple databases are listed even when a single database is specified.

  • Fixes an issue where raster POINT column type import would incorrectly throw an exception.

  • Fixes a crash that can occur when restoring a dump for a table with previously deleted columns.

  • Updates the export COPY TO command to include headers by default.

  • Removes the file_type parameter from the create_table Thrift API. This parameter was not used.

  • Fixes a crash that can occur when executing SQL commands containing comments.

  • Fixed the setting for default database (DEFAULT_DB) being ignored in a SAML login for a user who already exists.

HeavyRender - New Features and Improvements

  • The OpenGL renderer driver has been fully removed as of this release. Vulkan is the only available driver and enables a more modern, flexible API. As a result, the renderer-use-vulkan-driver program option has been removed. Remove any references to that program option from your configuration files. For more on the move to the Vulkan driver, see Vulkan Renderer.

  • A novel polygon rendering algorithm is now used as the default when rendering polygons. This algorithm performs no triangulation and does not require “render groups” (a hidden column to assist the old polygon rendering algorithm). However, the render groups column is still added on import as a fallback. See Importing Geospatial Data for more on render group deprecation.

  • You can now hit-test certain render queries with subqueries more effectively. For example, if the subquery is only used for filter predicates, renders should now be sped up and hit-testing more flexible.

HeavyRender - Fixed Issues

  • Render times are now logged correctly (“render_vega-COMPLETED nonce:2 Total Execution: (ms), Total Render: (ms)”). The execution time and render time were incorrectly logged as 0 in Releases 5.9 and 5.10.

  • Fixes a regression introduced in Release 5.10.0 when hit-testing an Immerse cohort-generated query. The hit-test would result in an error such as the following: “Cannot find column in hit-test cache for query …”

  • Resolves a crash when trying to hit-test render queries with window functions or cursorless table functions.

  • Fixes an issue where a multi-layer, multi-GPU render with a poly or line mark as the first layer can result in ghosting artifacts if the query associated with that layer resulted in 0 rows.

  • Fixes an issue when switching from a density accumulation scale with an auto-computed range (via min/max/+-1stStdDev/+-2ndStdDev) to a scale with an explicitly defined range. In this case, the explicit range was not reflected.

  • Removes a legacy constraint that prevented you from rendering a query that referenced one or more tables with more than one polygon/multipolygon column.

Heavy Immerse - New Features and Improvements

  • Improved speed of server interface using the Thrift binary protocol.

  • Data Manager has been redesigned to support HeavyConnect via S3, server file uploads, and expanded raster file support.

  • Introduced the new Gauge chart type.

  • Introduced a Welcome Panel and Help Center menu.

  • Rebranded interface for HEAVY.AI. Updated styles for the default dark and light themes.

  • Added option to toggle the legend on the New Combo chart.

  • Added configuration option for setting the default chart type.

  • Added configuration option for hiding specified chart types.

  • Added auto-selection of geo columns and measures on geo chart types.

  • Adjusted maximum bins for larger Top-N groups.

  • Added support for cross-domain configuration without SSL.

  • BETA: Added filter support for global custom expressions.

  • BETA: Introduced the new iframe chart type.

  • BETA: Introduced Arrow transport protocol for a limited number of chart types.

Heavy Immerse - Fixed Issues

  • Fixed various minor UI and performance issues.

  • Fixed parameter creation from dashboard title in Safari browser.

  • Fixed displaying of the Jupyter logo when integration is unavailable.