Release Notes
Use of HEAVY.AI is subject to the terms of the OmniSci End User License Agreement (EULA).
The latest release of HEAVY.AI is 6.0.0.
As with any software upgrade, it is important that you back up your data before you upgrade HEAVY.AI. Each release introduces efficiencies that are not necessarily compatible with earlier releases of the platform. HEAVY.AI is never expected to be backward compatible.
For assistance during the upgrade process, contact HEAVY.AI support at [email protected] before you upgrade your system.

6.0.0 - April 11, 2022

HeavyDB - New Features and Improvements

  • Support for fast string functions on dictionary-encoded text columns (the default), including LOWER, UPPER, INITCAP, TRIM/LTRIM/RTRIM, LPAD/RPAD, REVERSE, REPEAT, SUBSTRING/SUBSTR, REPLACE, OVERLAY, SPLIT_PART, REGEXP_REPLACE, REGEXP_SUBSTR, and CONCAT (||). The output of these expressions can be chained, grouped by, and used on both the left and right sides of join predicates.
  • Support for fast string equality/inequality operations without the previous requirement of watchdog disablement when the two columns do not share dictionaries.
  • Support for fast case statements with multiple text column inputs that do not share dictionary-encoded strings.
  • Support for ENCODE_TEXT to encode none-encoded strings, which can then be grouped on and manipulated like dictionary-encoded strings. This operator is not intended for interactive use at scale but instead for ELT-like scenarios. Use the new server flag watchdog-none-encoded-string-translation-limit to set the upper cardinality allowed for such operations (1,000,000 by default).
  • Support for UNION ALL is enabled by default, and now works across string columns that do not share dictionaries with significantly better performance than in the previous release.
  • Window functions now support expressions in the PARTITION BY and ORDER BY clauses.
  • Support for subqueries in CASE statement clauses.
  • SHOW USER DETAILS is changed to only list those users with access to the currently-selected database. Previously, all users on the HeavyDB instance would be listed; this is still available to superusers with SHOW ALL USER DETAILS.
  • 10X improvements in initial join performance (including geo joins) through faster, parallelized hash table construction, removing redundant inter-thread hash table computation.
  • Improved join ordering to avoid loop joins in certain scenarios.
  • Parallel compilation of queries as well as inter-executor generated code increases concurrency and throughput in common, Immerse-driven scenarios by up to 20%. Also decreases latency for a single user interacting with dashboards or issuing SQL queries in a way that required new plans to be code-generated.
  • New result set recycler allows expensive query substeps, such as subqueries, to be cached using a SQL hint (/*+ keep_result */), dramatically improving performance where the subquery is reused across multiple queries (for example, in Immerse) and only the outer steps of the query vary.
  • The default for the header option of COPY TO to a CSV/TSV file has been changed from 'false' to 'true'.
  • Faster dictionary map in StringDictionaryProxy, accelerating various string operations involving transient entries.
  • Arrow execution endpoints now use multiple executors and can run concurrently like queries issued to the Thrift endpoints.
  • Addition of sparse dictionary output capability for Arrow queries, which automatically creates a subset of a string dictionary to send via Arrow when it detects that it is faster than sending the full, unfiltered dictionary. This provides orders-of-magnitude better server- and client-side performance and scalability for common cases where large dictionary-encoded text columns are filtered or top-k sorted such that only a small subset of dictionary entries are needed in the result set.
  • ST_INTERSECTS now can operate directly on top of compressed (the default) coordinates, leading to 2-3X increase in speed.
  • New table function framework allows for both system and user-defined table functions. Table functions can run on both CPU and GPU and are designed for efficient, scalable execution of custom algorithms in-situ on data that might be hard or impossible to implement in SQL.
  • Support for generate_series table function (similar to Postgres) for easy and fast integer series generation, particularly useful for left joins against binned tables to fill in gaps, whether for visualization or downstream operations like window functions, and generate_random_strings for generation of string columns of a user-defined size and cardinality.
  • Support for geo_rasterize and geo_rasterize_slope table functions to efficiently bin vector data into gap-free bins, with the optional ability to fill in null values, apply box blur, and compute slope and aspect ratios.
  • Initial support for HeavyRF, a module that allows for real-time, ray-based computation of signal propagation, taking inputs of both terrain data and real or hypothetical signal sources.
  • Beta support for Python-defined scalar (row-level) and tabular User Defined Functions (UDFs and UDTFs), using the RBC library to translate Numba python code into LLVM IR that is JITed into query execution code for fast, scalable, custom user-defined capabilities.
  • Complete redesign and rewrite of Parquet import to one that is more robust, efficient, and performant.
  • Adds support for import from regex parsed files on either the server file system or S3 using the COPY FROM command.
  • The geo and parquet WITH options of COPY FROM have been deprecated and replaced by source_type. Using the deprecated syntax generates the following: Deprecation Warning: COPY FROM WITH (geo='true') is deprecated. Use WITH (source_type='geo_file') instead. Update any scripts you have to replace the deprecated syntax with the new syntax. For more information, see CSV/TSV Import.
  • (BETA) Adds support for import from RDBMS/data warehouses using the COPY FROM command.
  • Adds system table support.
  • A new default information_schema database contains 10 new system tables that provide information regarding CPU/GPU memory utilization, storage space utilization, database objects, and database object permissions.
  • New system dashboards that enable intuitive visualization of system resource utilization and user roles and permissions.
  • Support for Zarr and NetCDF raster file import.
  • You can now import raster files that use ground control points as geospatial references.
  • Support for file path filtering, globbing, and sorting when importing geo and raster files.
  • Improved error messaging when attempting to save a dashboard that uses a duplicate dashboard name.
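
For the COPY FROM deprecation noted in the list above, a script-migration pass could look like the following minimal Python sketch. It handles only the geo='true' case, because that is the one replacement the warning text spells out. The helper name and the sample statement are hypothetical, and the regex is a plain textual match, so review its output before applying it to real scripts.

```python
import re

# Matches the deprecated option as quoted in the deprecation warning:
# geo='true' (whitespace around '=' and letter case are tolerated).
_DEPRECATED_GEO = re.compile(r"\bgeo\s*=\s*'true'", re.IGNORECASE)


def migrate_copy_from(sql: str) -> str:
    """Replace the deprecated geo='true' option with source_type='geo_file'."""
    return _DEPRECATED_GEO.sub("source_type='geo_file'", sql)


if __name__ == "__main__":
    # Hypothetical statement; table and file names are illustrative only.
    old = "COPY parcels FROM '/data/parcels.shp' WITH (geo='true');"
    print(migrate_copy_from(old))
```

Statements that do not contain the deprecated option pass through unchanged, so the function can be run over a whole script file.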

HeavyConnect

  • Support for connections to delimited files on either the server file system or S3. S3 support includes an option to use the S3 Select API, which provides better performance but with limitations on supported column types.
  • Support for connections to Parquet files on either the server file system or S3. HeavyConnect leverages Parquet metadata to provide efficient data access and row group-level filter push down.
  • Parquet column type coercion. Convert Parquet column types to more memory-efficient HeavyDB column types for use cases that guarantee no loss of information.
  • Connections to regex parsed files on either the server file system or S3. This enables you to query unstructured text files, such as logs, by specifying regular expression patterns that extract components of the text files into table columns.
  • (BETA) Support for connections to Relational Database Management Systems and Data Warehouses, leveraging the Open Database Connectivity (ODBC) interface to provide seamless access to data.
  • (BETA) ODBC column type coercion. Use HeavyConnect to convert ODBC column types to more memory-efficient HeavyDB column types for use cases that guarantee no loss of information.
  • Support for scheduled data refreshes. Specify a start date time and interval at which connected data gets refreshed.
  • Adds support for disk level caching. By default, data fetched by HeavyConnect are cached at the disk level in addition to normal CPU/GPU level caching. This provides better overall query performance for network based connections, such as S3, and systems with limited CPU/GPU memory capacity. Disk cache size and level can be set through HeavyDB server configuration.
  • Adds support for file path filtering, globbing, and sorting for Parquet, delimited, and regex parsed file use cases.
  • Complete redesign and rewrite of the Parquet detect_column_types Thrift API. The Parquet detect/data preview feature is now more robust, efficient, and performant.
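
The idea behind regex parsed files, where each capture group in a pattern becomes one table column, can be illustrated with a small Python sketch. The log format, pattern, and column names below are hypothetical. The sketch uses Python's re module rather than HeavyConnect itself, whose regex dialect and handling of non-matching lines may differ; skipping non-matching lines is a choice made here for illustration.

```python
import re

# Hypothetical log format: "<date> <time> <LEVEL> <message>".
LINE_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w+) (.*)$")
COLUMNS = ("event_time", "level", "message")


def parse_lines(lines):
    """Turn each matching line into a row dict; skip lines that do not match."""
    rows = []
    for line in lines:
        match = LINE_PATTERN.match(line)
        if match:
            # One capture group per column, in declaration order.
            rows.append(dict(zip(COLUMNS, match.groups())))
    return rows


if __name__ == "__main__":
    sample = [
        "2022-04-11 09:15:02 INFO server started",
        "not a log line",
        "2022-04-11 09:15:07 ERROR disk cache full",
    ]
    for row in parse_lines(sample):
        print(row)
```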

HeavyDB - Fixed Issues

  • Change to query interrupt mechanism allowing certain classes of queries such as loop joins to be easily and quickly interrupted.
  • Fixed crash that could occur with joins on predicates that had functions in the left-hand-side expression, for example geoToH3.
  • Fix crash that could occur with Arrow queries that did not return results.
  • Avoid building metadata for empty result sets.
  • Fixes a crash that can occur when executing queries on GPU that involve a baseline group by and variable length column projections.
  • Fixes some table query concurrency bottlenecks. Previously, queries such as INSERT, TRUNCATE, and DROP TABLE required system wide locks to execute and would therefore block execution of other unrelated queries. These kinds of queries can now be executed concurrently.
  • Fixes a crash that can occur on server restart when the disk cache is enabled and tables with cached data are deleted.
  • Fixes a crash that can occur when the max_rows table option is altered for an empty table.
  • Fixes an issue in the JDBC driver where tables from multiple databases are listed even when a single database is specified.
  • Fixes an issue where raster POINT column type import would incorrectly throw an exception.
  • Fixes a crash that can occur when restoring a dump for a table with previously deleted columns.
  • Updates the export COPY TO command to include headers by default.
  • Removes the file_type parameter from the create_table Thrift API. This parameter was not used.
  • Fixes a crash that can occur when executing SQL commands containing comments.
  • Fixed the setting for default database (DEFAULT_DB) being ignored in a SAML login for a user who already exists.

HeavyRender - New Features and Improvements

  • The OpenGL renderer driver has been fully removed as of this release. Vulkan is the only available driver and enables a more modern, flexible API. As a result, the renderer-use-vulkan-driver program option has been removed. Remove any references to that program option from your configuration files. For more on the move to the Vulkan driver, see Vulkan Renderer.
  • A novel polygon rendering algorithm is now used as the default when rendering polygons. This algorithm does no triangulation nor does it require “render groups” (a hidden column to assist the old polygon rendering algorithm). However, the render groups column is still added on import as a fallback. See Importing Geospatial Data for more on render group deprecation.
  • You can now hit-test certain render queries with subqueries more effectively. For example, if the subquery is only used for filter predicates, renders should now be sped up and hit-testing more flexible.
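
Removing the renderer-use-vulkan-driver option from configuration files can be scripted. The Python sketch below assumes a simple "key = value" line format and works on the file's text; the function name and sample content are hypothetical, and commented-out lines are deliberately left alone.

```python
REMOVED_OPTION = "renderer-use-vulkan-driver"


def strip_removed_option(config_text: str) -> str:
    """Drop every line that sets the removed option; keep all other lines."""
    kept = []
    for line in config_text.splitlines(keepends=True):
        # The key is whatever precedes the first '=' on the line.
        key = line.split("=", 1)[0].strip()
        if key == REMOVED_OPTION:
            continue
        kept.append(line)
    return "".join(kept)


if __name__ == "__main__":
    # Hypothetical configuration content for illustration.
    sample = "port = 6274\nrenderer-use-vulkan-driver = false\nnum-gpus = 2\n"
    print(strip_removed_option(sample), end="")
```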

HeavyRender - Fixed Issues

  • Render times are now being logged correctly (“render_vega-COMPLETED nonce:2 Total Execution: (ms), Total Render: (ms)”). The execution time and render time were incorrectly logged as 0 in Releases 5.9 and 5.10.
  • Fixes a regression introduced in Release 5.10.0 when hit-testing an Immerse cohort-generated query. The hit-test would result in an error such as the following: “Cannot find column in hit-test cache for query …”
  • Resolves a crash when trying to hit-test render queries with window functions or cursorless table functions.
  • Fixes an issue where a multi-layer, multi-GPU render with a poly or line mark as the first layer can result in ghosting artifacts if the query associated with that layer resulted in 0 rows.
  • Fixes an issue when switching between a density accumulation scale with an auto-computed range (via min/max/+-1stStdDev/+-2ndStdDev) to a scale with an explicitly defined range. In this case, the explicit case was not reflected.
  • Removes a legacy constraint that prevented you from rendering a query that referenced one or more tables with more than one polygon/multipolygon column.

Heavy Immerse - New Features and Improvements

  • Improved speed of server interface using the Thrift binary protocol.
  • Data Manager has been redesigned to support HeavyConnect via S3, server file uploads, and expanded raster file support.
  • Introduced the new Gauge chart type.
  • Introduced a Welcome Panel and Help Center menu.
  • Rebranded interface for HEAVY.AI. Updated styles for the default dark and light themes.
  • Added option to toggle the legend on the New Combo chart.
  • Added configuration option for setting the default chart type.
  • Added configuration option for hiding specified chart types.
  • Added auto-selection of geo columns and measures on geo chart types.
  • Adjusted maximum bins for larger Top-N groups.
  • Added support for cross-domain configuration without SSL.
  • BETA: Added filter support for global custom expressions.
  • BETA: Introduced the new iframe chart type.
  • BETA: Introduced Arrow transport protocol for a limited number of chart types.

Heavy Immerse - Fixed Issues

  • Fixed various minor UI and performance issues.
  • Fixed parameter creation from dashboard title in Safari browser.
  • Fixed displaying of the Jupyter logo when integration is unavailable.

5.10.2 - February 14, 2022

OmniSciDB - New Features and Improvements

  • The COPY TO command now exports time, date, and timestamp data types in ISO 8601 format. Previously, date and timestamp data types were exported as unix epochs.
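
To see what this change means for downstream consumers of exported files, the Python sketch below renders the same instant both ways. The exact ISO 8601 layout HeavyDB emits (separator, fractional seconds, time zone suffix) is not specified here, so the format string is an assumption.

```python
from datetime import datetime, timezone


def epoch_to_iso8601(epoch_seconds: int) -> str:
    """Render a Unix epoch (seconds, UTC) as an ISO 8601 timestamp string."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).strftime(
        "%Y-%m-%dT%H:%M:%S"
    )


if __name__ == "__main__":
    # 1644796800 is midnight UTC on February 14, 2022.
    print(epoch_to_iso8601(1644796800))  # 2022-02-14T00:00:00
```

Scripts that previously parsed exported date and timestamp columns as integers will need to parse strings of this kind instead.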

OmniSciDB - Fixed Issues

  • Fixed crash that could occur when window function partitions were sorted on multiple columns.
  • Fixed crash that could occur when in-situ queries had query step punted to CPU.
  • Fixed crash that could occur when columnar output is forced for queries projecting variable length types, either by command-line flag or query hint.

OmniSci Render - Fixed Issues

  • Fixes hit-testing issues in single-node related to rendering SQL with aggregate subqueries. Such queries could be generated using the cohort feature in Immerse or using custom dimensions. Previously, you could see errors such as the following in map charts: Cannot find column <column name> in hit-test cache for query <sql query>. Most errors of that form should be resolved in single-node.

OmniSci Immerse - Fixed Issues

  • Fixed New Combo Chart legend appearing over chart bars.
  • Fixed 'Failed to Load Tables' error when navigating to the data manager.

5.10.1 - January 25, 2022

OmniSci Immerse - Fixed Issues

  • Fixed issue where projections with more than 1M rows of geo and other variable-length types could crash the server when --columnar-large-projections is enabled.
  • Fixed issue with rendering results of certain subqueries when running distributed.
  • Fixed data manager error occurring when global custom expressions beta feature is enabled.
  • Upgrade log4j to 2.17.1 to address CVE-2021-44832.

5.10.0 - January 10, 2022

The SAML Entitlement feature is deprecated and may be removed in a future release. Row-level security (RLS), with a role specified through SAML, provides an improved replacement.

OmniSciDB - New Features and Improvements

Administration

  • Row-level security (RLS): Administrators can use new commands CREATE POLICY, SHOW POLICIES, and DROP POLICY to apply security filtering to queries run as a user or with a role.

Performance

  • Significantly more performant, parallelized window functions, executing up to 10X faster than in Release 5.9.
  • Automatic use of columnar output (instead of the default row-wise output) for large projections, particularly benefitting window functions, subqueries, and table function calls that returned large numbers of rows, lowering query times by 5-10X in some cases. The threshold defaults to 1M rows, but can be modified with the columnar-large-projections-threshold flag (and turned off with columnar-large-projections=false).
  • UNION ALL now works with tables that do not share dictionary columns and supports query patterns such as grouped inputs. Previously only projections were supported.
  • Large IN subqueries are now significantly faster and scalable by rewriting the associated query plans as decorrelated joins.
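
Why rewriting a large IN subquery as a join helps can be shown with a conceptual Python sketch. It is not the HeavyDB planner's implementation; the function names and row layout are hypothetical. It only contrasts rescanning the subquery output for every outer row with building a hash lookup once and probing it.

```python
def filter_in_naive(outer_rows, key, subquery_values):
    """Rescan the subquery output for every outer row: O(n * m)."""
    return [row for row in outer_rows if any(row[key] == v for v in subquery_values)]


def filter_in_as_join(outer_rows, key, subquery_values):
    """Build a hash set once, then probe it per outer row: O(n + m)."""
    lookup = set(subquery_values)
    return [row for row in outer_rows if row[key] in lookup]


if __name__ == "__main__":
    orders = [{"id": 1, "cust": 10}, {"id": 2, "cust": 20}, {"id": 3, "cust": 30}]
    vip_customers = [30, 10, 10]  # duplicates are harmless in either version
    print(filter_in_as_join(orders, "cust", vip_customers))
```

Both functions return the same rows; only the amount of work per outer row differs.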

SQL Functionality

  • Added support for full set of ST_TRANSFORM SRIDs supported by geos/proj4 library.
  • Significant ST_DWithin speedups involving LINESTRING types via an optimization to clip the checked region of the LINESTRING to just the portion that can possibly fall within the specified distance of the other geometry.
  • Removes a previous restriction where null values were not allowed for text array columns.
  • Adds ANY_VALUE as canonical SQL alias for existing SAMPLE aggregate operator.

Geo Enhancements

  • Support for full set of ST_TRANSFORM SRIDs supported by geos/proj4 library.
  • High performance UTM transforms (ST_TRANSFORM) available to and from web mercator and geographic coordinate systems.
  • Significant ST_DWithin speedups involving LINESTRING types via an optimization to clip the checked region of the LINESTRING to just the portion that can possibly fall within the specified distance of the other geometry.

Import Improvements

  • Adds support for import from dozens of image and raster file types, such as jpeg, png, geotiff and ESRI grid, including remote files.
  • Adds support for numerous vector GIS files (100+ formats supported by current GDAL release).
  • Adds support for multidimensional array import from GRIB/GRIB2 formats common in science and meteorology.
When importing GRIB/GRIB2 raster files, the default multithreaded import behavior is unstable due to a bug in the GDAL library; this issue will be addressed in the next release. You must use the WITH option threads=1 to force single-threaded operation and avoid a server crash. This restriction does not apply to any other raster formats.
  • Adds support for selected layers or bands within multilayer files.
  • Adds a new flag to SQLImporter that allows narrowing of integer values.
  • Improves point import speed regardless of source.
  • Extended file metadata and column detection in omnisql to include additional formats.

OmniSciDB - Fixed Issues

  • Fixes an issue in the ODBC driver where some SQL statements containing double quotes would result in an error.
  • Fixes an issue where decimal value imports could fail, even when the specified number of maximum rejections has not been reached.
  • Fixes a crash that occurs when the max rows option is set for an empty table.

OmniSci Render - Fixed Issues

  • Fixes a potential ghosting artifact when rendering consecutive accumulation renders in a multi-gpu system using the OpenGL driver.

OmniSci Immerse - New Features and Improvements

  • Introduces dashboard-level Named Custom SQL.
  • Improves Table Chart export to access all data represented by a table chart.
  • Adds overridable CSS file (override.css) for custom CSS in embedded implementations.
  • Panning support in the New Combo chart with alt+scroll.
  • Adds a new color picker user interface and other UI improvements.
  • Alerts for unsaved dashboard changes.
  • Geo-joined Choropleth charts now spatially cross-filter the fact table on zoom and pan.
  • Expanded functionality of the Iframe API (Beta).
  • Introduces optional higher data throughput Arrow Transport (Beta).
  • Expanded functionality for Embedded UI Customization (Beta).
  • Adds database-level Global Custom Expressions (Beta).
  • Adds cohort support for crosslinking (Beta).

OmniSci Immerse - Fixed Issues

  • Fixed application of Configuration UI changes to discretion charts’ legends.
  • Fixed custom SQL expression support for custom data sources.
  • Fixed various cases for upgrading legacy charts to the New Combo chart.
  • Fixed various minor UI issues.

5.9.0 - November 18, 2021

OmniSciDB - New Features and Improvements

  • Adds IF EXISTS support to DROP USER and DROP ROLE statements.
  • Significant speedup for POINT and fixed-length array imports and CTAS/ITAS, generally 5-20X faster.

OmniSciDB - Fixed Issues

  • Fixes an issue where null values and empty arrays were not imported correctly for text array columns.
  • Correctly account for compressed size of 8-bit and 16-bit dictionary encoded text columns and all geo type columns to avoid overly aggressive punting of queries to CPU when columns of these types are used in a query.
  • Prevent overflow of columns with widths less than 4 bytes when columnar projection is used in a query, which could lead to a crash.

OmniSci Render - New Features and Improvements

  • Improves the performance of density accumulation stats gathering when using min/max/1stStdDev/2ndStdDev ranges, up to 50x speedups in some instances.
  • The PNG encoding step of a render request is no longer a blocking step, thereby improving render concurrency.

OmniSci Render - Fixed Issues

  • Fixes a potential timeout in the Vulkan renderer in a multi-GPU server configuration when many queries/renders executing at the same time cause GPU contention. The timeout can manifest as a VulkanDeviceLost error that can lock up renders for a minute before the renderer can recover.
  • Fixes a minor memory leak when caching parsed Vega JSONs.
  • Fixes a possible ghosting artifact in multilayer/multi-GPU renders when individual layers use different subsets of available GPUs when rendered.

OmniSci Immerse - New Features and Improvements

  • BETA - Adds custom expressions to table columns.
  • BETA - Adds Crosslink feature with Crosslink Panel UI.
  • BETA - Adds Custom SQL Source support and Custom SQL Source Manager.
  • Adds support to hide deprecated chart types from add/edit chart menu.
  • Improved speed of user role verification.

OmniSci Immerse - Fixed Issues

  • Fixes minor issues in the Parameter manager UI.
  • Fixes New Combo chart binning migration.
  • Fixes issues in the global side navigation.
  • Fixes fetching a cohort count when a custom SQL filter contains a parameter.
  • Fixes Vega chart color-utils to always fetch latest color palettes.
  • Fixes filter removal after changing and resetting the default value if you use a SQL custom dimension as a parameter.
  • Fixes minor UI styling issues.
  • Fixes custom source selection in Vega source selector.
  • Fixes cross-filter support with geo-joined bounding boxes.
  • Fixes resetting of parameter definitions when canceling from inside the chart editor view.
  • Fixes new cases of chart title helper support.
  • Fixes New Combo Bar label automatic formatting.

5.8.1 - November 9, 2021

OmniSciDB - New Features

  • Added a new SHOW ROLES command for viewing directly assigned and effective user roles.

OmniSci Render - Fixed Issues

  • Fixed a regression in Release 5.8.0 where purging idle render sessions can result in server crashes with the OpenGL renderer.
  • Fixes a regression in Release 5.8.0 where poly renders can result in segfaults.

OmniSci Immerse - Fixed Issues

  • User role checks that block dashboard load and login are completed much more quickly.

5.8.0 - October 11, 2021

Release 5.8 officially enables the new Vulkan backend renderer as the default, replacing the OpenGL renderer. For more information on Vulkan, reasons for the change, and troubleshooting, see Vulkan Renderer.
The legacy OpenGL backend renderer is still available as a fallback, but will be deprecated and removed in subsequent releases, so it is advisable to use the legacy OpenGL renderer only if you have a blocking issue with Vulkan. To disable the Vulkan renderer and enable OpenGL, add the following flag to your server configuration:
renderer-use-vulkan-driver=false
In distributed clusters, the configuration file for each node must have this parameter set to use OpenGL.
Releases 5.7 and higher require the installation of a Vulkan API loader library to support the Vulkan renderer. This library is required regardless of whether you are using the Vulkan renderer or not. If you are upgrading from Release 5.7, you have already installed this library and no further action is required.
If you see an error similar to the following when trying to start a renderer-enabled Release 5.8 server for the first time, you need to install the Vulkan API loader:
error while loading shared libraries: libvulkan.so.1: cannot open shared object file: No such file or directory
To install the Vulkan API loader:
  • On CentOS: sudo yum install vulkan
  • On Ubuntu: sudo apt install libvulkan1
For a summary of how to install the Vulkan loader on various other Linux distributions, see https://linuxconfig.org/install-and-test-vulkan-on-linux.
For other troubleshooting issues, see Vulkan Renderer.
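
Before starting a renderer-enabled server, you can check whether the loader library named in the error message above is present. The following Python sketch simply asks the dynamic linker to load libvulkan.so.1 through the standard ctypes module; the function name is hypothetical, and a successful load says nothing about whether a working Vulkan driver is installed.

```python
import ctypes


def vulkan_loader_available(soname: str = "libvulkan.so.1") -> bool:
    """Return True if the dynamic linker can load the Vulkan loader library."""
    try:
        ctypes.CDLL(soname)
    except OSError:
        # Raised when the shared object cannot be found or opened.
        return False
    return True


if __name__ == "__main__":
    if vulkan_loader_available():
        print("Vulkan loader found")
    else:
        print("Vulkan loader missing; install it as described above")
```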

OmniSciDB - New Features and Improvements

  • Parallel executors now on by default (with default --num-executors=2).
  • Spatial joins between point types using ST_Distance are now accelerated using the overlaps join hash framework, with speed increases of up to 100x.
  • Window functions can run on empty partitions and can operate over tables with multiple fragments and shards.
  • Queries that need to run a query step on CPU due to memory pressure or compatibility reasons now execute only that individual step on CPU. Previously, the full query was restarted and all steps ran on CPU.
  • Support WIDTH_BUCKET operator for easier numeric binning.
  • Natively support ST_Transform to/from all UTM zones and EPSG:4326 (Lon/Lat) and EPSG:900913 (Web Mercator).
  • Support provided for file path regex filter and for file path sort order when running the COPY FROM command.
  • New ALTER SYSTEM CLEAR commands enable clearing CPU or GPU memory.
  • Error messages for DUMP/RESTORE commands are improved.
  • Validations and error messages are improved when specifying default column values.
  • More robust handling is added for long decimal strings during import.
  • Non-superusers can view and interrupt their own queries.
  • Rewrote query plans where certain aggregates are performed on the same expression that is grouped by (for example, COUNT DISTINCT column where column is also being grouped) to improve performance for these queries.
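
For readers new to WIDTH_BUCKET, the Python sketch below reproduces the equal-width binning it performs, following the semantics commonly documented for the SQL operator: bucket 0 below the range, bucket count + 1 at or above it. This is an assumption about general SQL behavior; HeavyDB's handling of edge cases such as NULLs or descending ranges is not covered and may differ.

```python
import math


def width_bucket(value: float, low: float, high: float, count: int) -> int:
    """Equal-width bucket number following common SQL WIDTH_BUCKET semantics."""
    if count <= 0 or not low < high:
        raise ValueError("count must be positive and low must be less than high")
    if value < low:
        return 0  # underflow bucket
    if value >= high:
        return count + 1  # overflow bucket
    bucket = math.floor((value - low) / (high - low) * count) + 1
    # Guard against floating-point rounding pushing an in-range value too far.
    return min(bucket, count)


if __name__ == "__main__":
    # Five equal-width buckets over [0, 10): 7.2 falls in the fourth bucket.
    print(width_bucket(7.2, 0, 10, 5))  # 4
```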

OmniSciDB - Fixed Issues

  • Fixed an issue where window functions with a preceding filter would pull all of a table’s columns into memory, even if unused by the query.
  • Disabled window functions for updates because it could crash the server. Targeted to be fixed and enabled in a future release.
  • Various fixes related to aggregate window functions.
  • Improvements to auto-casting logic for extension functions, user-defined row functions, and table functions.
  • Avoid unnecessary translation between dictionary-encoded text columns for certain classes of hash joins when the inner and outer tables’ join columns share the same dictionary, thereby improving performance.
  • Fixed crash that could occur when performing a join on multifragment input when the hash table is built on GPU but the join is executed on CPU.
  • Various bug fixes and performance improvements for runtime query interrupt.
  • Fixed broken Immerse dashboard import in omnisql.

OmniSci Rendering - New Features and Improvements

  • The default renderer is now Vulkan. The legacy OpenGL renderer can still be used as a fallback. See the notice at the top of these release notes for more information.
  • Improved the memory footprint/performance of the multi-GPU compositor by doing overlapped, tiled transfers from render GPUs to the compositor GPU. The more GPUs on the node, the bigger potential gains.
  • Added extra logging in the event of a timeout or Vulkan device lost error.

OmniSci Immerse - New Features

  • Migrated numeric binning in Immerse to use the new WIDTH_BUCKET SQL operator, significantly improving binning/histogram performance in some cases.
  • BETA: Added a new global side navigation.
  • BETA: Added support for 3D terrain in 3D scatter chart.
  • BETA: Support for hiding Immerse UI elements in iframed app usage is added.
  • BETA: Crossfilter referencing in custom filters is supported.

OmniSci Immerse - Fixed Issues

  • Fixed an issue where the parameters dropdown would not autocomplete.
  • Fixed incorrect file headings in chart map export.
  • Fixed multilayer legend value for Choropleth chart with geo heat map.
  • Fixed display of long dashboard titles.
  • Fixed dashboard duplication permission case.
  • Fixed hit-test error on a custom measure.
  • Fixed SQL syntax error on Pointmap chart with custom SQL group-by dimension.
  • Fixed rendering of Scatter Plot and Pointmap charts after hovering over points.
  • Fixed various minor UI bugs affecting overall usability.

5.7.1 - September 14, 2021

OmniSci Immerse - Fixed Issues

  • Fixed SAML login to a specified database based on login URL with database name.
  • Fixed role retrieval for usernames with @ and other symbols ($ & + , ; : / = ?).

5.7.0 - August 26, 2021

OmniSciDB - New Features and Improvements

Query Capabilities
  • Added support for default column values. When creating tables or adding new columns to existing tables, you can now specify column default values. This works in both SQL and with Thrift APIs.
  • Added support for APPROX_QUANTILE, with performance and functionality similar to APPROX_MEDIAN, which is APPROX_QUANTILE called with a 50% quantile argument.
  • Per-kernel interrupt performance significantly improved, now on by default. Queries can be interrupted using Ctrl + C in omnisql, or by calling the interrupt API.
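
The relationship between the two functions, a median being the quantile at 0.5, can be checked with an exact computation. The Python sketch below uses linear interpolation between neighboring ranks, which is an assumption made for illustration; the HeavyDB functions are approximate, so their results on real data can differ slightly from exact values.

```python
import statistics


def quantile(values, q: float) -> float:
    """Exact quantile with linear interpolation between neighboring ranks."""
    if not values or not 0.0 <= q <= 1.0:
        raise ValueError("need at least one value and 0 <= q <= 1")
    ordered = sorted(values)
    position = q * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


if __name__ == "__main__":
    data = [7, 1, 5, 3]
    # The 0.5 quantile and the median agree.
    print(quantile(data, 0.5), statistics.median(data))
```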
Performance
  • The Arrow data frame API now differentiates between where the query is run and where the Arrow result is requested. For example, a query that must run on CPU can be pushed to GPU for Arrow IPC sharing, and a query that would normally run on GPU can still run on GPU, even if the user requests the Arrow buffer in CPU memory via IPC sharing.
  • Improved performance of high-cardinality group-by queries. Large cohorts in Immerse should show substantial performance improvement.
  • Parallel executors now in public beta (set with --num-executors flag). In future versions of OmniSci, the number of executors will be set to 2 by default.
Administrative
  • Added a new REASSIGN OWNED command, which changes ownership of database objects (tables, views, dashboards, etc.) from a user or set of users in the current database to a different user.
  • Added a new SHOW USER DETAILS command for introspecting user information.
  • Error messages have been updated to remove redundancy and otherwise extraneous content.
Geospatial
  • Added support to ST_Transform for transformations from 900913 (web mercator) to 4326 (lon/lat).
  • ST_Contains and ST_Intersects joins and filters can now run fully accelerated on dynamically constructed points using ST_Point(lon, lat).
  • Geospatial projection support for points is now more robust, faster, and widely supported. For example, ST_SetSRID(ST_Point(id,id),4326) is now supported in projections.
  • Major enhancements to ODBC driver, including support for Geospatial types.
  • Added FlatGeoBuf import/export support, which is about 3x faster than shapefile and 8x faster than geojson.
  • Improved handling of geospatial columns in intermediate results and temporary tables. Previously, a "columnar conversion not supported" error would be thrown in joins involving multi-fragment geospatial tables. Now, the geospatial column can be zipped up, allowing joins and similar operations to proceed.
Enhanced AWS Permissions and Session Management
  • Added support for using IAM roles or server permissions when importing data from AWS S3. Admins can enable the use of IAM roles when running on an EC2 instance. Credentials can also be configured on the server either through AWS environment variables or credential files. Enable this option with the allow-s3-server-privileges server configuration.
  • Added support for AWS session tokens through omnisql and Thrift import APIs.

OmniSciDB - Fixed Issues

  • Improved messaging related to an error that can occur when the sample data insert script downloads data to a path outside the server data directory.
  • Fixed an issue where the server could encounter an error on startup due to a pre-4.0 release migration bug.
  • Fixed a race condition that can occur when SELECT queries and auto-vacuuming execute concurrently.
  • Projections without limits should no longer recompile when query literals change, significantly increasing performance.
  • Arrow over-the-wire query requests were always executed on CPU; they now run on the default device type (and can be overridden by the user with a query hint).
  • Fixed a crash that could occur with empty Arrow result sets.
  • omnisql now returns error status codes if a single command is called and the command fails. This allows omnisql to be more easily embedded into scripts with error checking.
  • Resolved an issue where join table reordering could fail for some geojoins, resulting in an error message or crash.
  • Resolved a crash that occurred when inserting a NULL value into a geospatial column in distributed mode.
  • Various bug fixes and performance improvements for geospatial types in Insert Into As Select / Create Table As Select queries in distributed mode.
  • Resolved a crash that occurred when updating a BOOLEAN array column.
  • Resolved null handling issues when grouping by BIGINT columns.
  • Resolved an issue which prevented CURRENT_TIME() or NOW() from being used in aggregate queries.
  • Resolved an issue involving CASE statements with string literals in one of the case branches, where the returned results could be incorrect.
  • Comparison between full array and indexed array columns will throw an appropriate error. Previously, SQL with such a comparison could cause the server to crash.
  • Resolved an issue where NULLS LAST in an ORDER BY clause could cause an incorrect ordering with respect to sign (e.g. negatives before positives).
  • Resolved an issue where CTAS/ITAS queries with small returns in the SELECT statement could enter an infinite loop in distributed mode.
  • Resolved an issue preventing the owner of a database from dropping that database.
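Because omnisql now returns an error status code when a single command fails, a calling script can branch on the exit code. The sketch below shows the general pattern in Python. The helper name run_and_check is hypothetical, and the command you pass is whatever omnisql invocation your environment already uses; no omnisql flags are assumed here.

```python
import subprocess
import sys


def run_and_check(command):
    """Run a command (a list of arguments) and return (ok, exit_code).

    ok is True only when the process exits with status 0.
    Hypothetical helper for illustration.
    """
    completed = subprocess.run(command, capture_output=True, text=True)
    return completed.returncode == 0, completed.returncode


if __name__ == "__main__":
    # Stand-in for a failing command; replace with your own omnisql invocation.
    ok, code = run_and_check([sys.executable, "-c", "import sys; sys.exit(3)"])
    print(ok, code)
```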

OmniSci Rendering - New Features and Improvements

Release 5.7 includes the official beta release of the new Vulkan-backed renderer, which will replace the current OpenGL renderer. For more information about Vulkan and the reasons we are making the switch, see Vulkan Renderer (Beta).
Because Vulkan will be the default renderer in Release 5.8, OmniSci strongly recommends using it now. The OpenGL renderer will be deprecated and removed in subsequent releases. The Vulkan renderer is reliable and stable, and switching to Vulkan now can help reveal any unforeseen issues. Finding such issues early, while an easy fallback exists, ensures a smooth, less risky transition.
To enable the beta Vulkan renderer, set the renderer-use-vulkan-driver configuration parameter to true.
In distributed clusters, the configuration file for each node must have the parameter set.
The Vulkan library is required for Release 5.7, regardless of which renderer you use. If you do not install the Vulkan library, you will see the following error when trying to start a renderer-enabled server for the first time:
error while loading shared libraries: libvulkan.so.1: cannot open shared object file: No such file or directory
If you receive this error, you need to install the Vulkan API loader:
  • CentOS: sudo yum install vulkan
  • Ubuntu: sudo apt install libvulkan1
For a summary of how to install the Vulkan loader on various Linux distributions, see: https://linuxconfig.org/install-and-test-vulkan-on-linux.
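To check for the Vulkan loader before starting the server, a short script along the following lines may help. This is a sketch that assumes Python's ctypes.util.find_library can see the system library cache; the function name vulkan_loader_status is hypothetical, and a "not found" result on a minimal system may only mean the lookup tools are unavailable.

```python
import ctypes.util


def vulkan_loader_status(find_library=ctypes.util.find_library):
    """Return a short message saying whether a Vulkan loader library is visible.

    find_library is injectable so the function can be exercised without Vulkan installed.
    """
    library = find_library("vulkan")
    if library is None:
        return (
            "Vulkan loader not found. "
            "CentOS: sudo yum install vulkan; Ubuntu: sudo apt install libvulkan1"
        )
    return f"Vulkan loader found: {library}"


if __name__ == "__main__":
    print(vulkan_loader_status())
```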
  • Significant speedups in large polygon renders across multiple GPUs. The larger the table, and the more GPUs, the bigger the speed increase.
  • Memory footprint improvements in compositing and anti-aliasing components.
  • 3-4x speedup with procedural rendering (that is, the “symbol” Vega mark type).
  • Better pipelining of compressed geo columns and projection math; 2x memory footprint improvement in some use cases.
  • Added a new Airplane symbol shape for Vega that can be used in symbol/legacy symbol marks.

OmniSci Immerse - New Features and Improvements

  • Significant enhancements to Immerse parameters.
    • Dashboard parameter widgets:
      • Side-panel parameter controllers can be added to dashboards.
      • Supports sophisticated custom dashboards with no code required.
    • Parameter value display across the product:
      • Chart titles and chart axes labels
      • Dashboard titles
      • Tooltips and Legends
    • Convenience methods to use parameters within:
      • Chart column selectors
      • Simple filters
      • Quick filters
    • Enhanced parameter management:
      • Show/hide hidden parameters
      • Improved user interface for tracking parameters usage
      • Allow parameter usage in Demo mode
      • Improved parameter selection autocomplete in all chart types
  • Improvements in Geo charts:
    • Ability to set top color category in Pointmaps.
    • Improved gradient selection in Pointmaps
    • Support Zoom Level in Map Chart `Zoom to` Field
  • High-precision, higher-performance lasso tool:
    • Now uses an ST_Contains filter expression for drawn polygons instead of expensive inside-triangle expressions.
    • Employs a dynamic level-of-detail drawing algorithm that automatically adjusts the resolution of parts of a polygon/circle that are in view so that it matches the results of the ST_Contains filter exactly for map charts.
  • 3D Pointmap chart (Beta release).
  • Embed HTML in a text chart (Beta release).
  • Updated mapd-connector with latest version of Thrift.

OmniSci Immerse - Fixed Issues

  • Unique category support for larger unique category sets.
  • Apply numeric filters using parameters.
  • Fixed display of parameters on new dashboards after a session timeout.
  • Fixed Choropleth hit test when hovering over a polygon.
  • Fixed display of boolean values in the input field of a global filter and dropdown menu.
  • Fixed parameter support in Choropleth charts with a geo join.
  • Fixed reversed axes labels on New Combo chart.
  • Fixed BIGINT column support in cohort builder.
  • Fixed display of parameter value in filter component input.
  • Fixed boolean setting of column value parameters.
  • Upgrade charts with column parameters set as dimension.
  • Fixed errant session clearing for non-autologin instances.
  • Fixed missing support for map-move crossfilter feature flags in Combo chart.
  • Fixed imperial formatting in popups.
  • Fixed a multilayer popup visibility when hovering over a line or point.
  • Various minor UI improvements and bug fixes.

5.6.4 - July 19, 2021

OmniSci release 5.6.4 fixes one issue.

OmniSciDB

Fixed Issue

  • Resolved an issue where cleaning up an expired session could cause the server to crash.

5.6.3 - June 25, 2021

OmniSci release 5.6.3 includes a new feature and an improvement.

OmniSciDB

Improvement

  • ITAS and CTAS queries are now protected from entering an infinite loop when small numbers of records are returned.

OmniSci Immerse

New Feature

  • Support for encrypted credentials in connectors is decoupled from authentication through Immerse.

5.6.2 - June 17, 2021

OmniSci release 5.6.2 includes new features and improvements and fixes several issues.

OmniSciDB

New Features and Improvements

  • Added a new 'airplane' symbol shape.
  • When using INSERT to insert floats into INT columns, the numbers are rounded instead of truncated; now consistent with COPY behavior.
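The difference between truncating and rounding a float on its way into an INT column can be illustrated as follows. This is a plain Python sketch, not HeavyDB code; the function names are hypothetical, and the release note does not say how exact ties such as 2.5 are rounded, so the example avoids them.

```python
import math


def truncate_to_int(value):
    """Drop the fractional part (the earlier INSERT behavior described above)."""
    return math.trunc(value)


def round_to_int(value):
    """Round to the nearest integer (the behavior now shared with COPY).

    Tie handling (for example 2.5) is not specified in the release note.
    """
    return round(value)


if __name__ == "__main__":
    for sample in (2.7, -2.7, 2.2):
        print(sample, truncate_to_int(sample), round_to_int(sample))
```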

Fixed Issues

  • Fixed an issue where UDFs created at server start time with the -udf flag were not generating code for the GPU.
  • Resolved an issue where certain queries with a smaller number of groups (< 20,000) would fail with a “Ran out of slots in the query output buffer” message.
  • Reduced incidence of “Ran out of slots in the query output buffer” errors with large group-bys.
  • Improved READ-ONLY configuration to never write to underlying storage.
  • Resolved an issue where a left join could get incorrect results if IS NOT NULL was part of the WHERE clause for the query.
  • Resolved sort order with aggregated results (negatives before positive) when NULLS LAST is specified.
  • Fixed an issue causing a crash when ITAS/CTAS populates a NULL point geometry object.
  • Resolved an issue where COPY FROM was restricting all other access in a distributed setup.
  • Improved AVG handling when NULL values are used.
  • Fixed a random segmentation fault that occurred when running render unit tests.

OmniSci Immerse

Fixed Issues

  • Resolved an issue where the record count on a zoomed map would reset after editing.
  • Fixed return key support for input of column type parameters.
  • Fixed the resetting of column and default value after a parameter is created.
  • Fixed caching of the number chart with a chart level filter.
  • Resolved issue related to support of Booleans for column value parameters.
  • Fixed an issue affecting null column values in quantitative color palette scales for Heat charts.
  • Batch upgrading of charts includes additional edge cases.
  • Resolved issue related to support for custom color dimensions in New Combo charts.

5.6.1 - May 10, 2021

OmniSci release 5.6.1 includes a new feature and fixes several issues.

OmniSciDB

Fixed Issues

  • Fixed an issue where a rename of a user with ALTER USER USER RENAME TO USER_NEW could cause issues later when trying to restart the server, depending on role membership.
  • Fixed a system crash that occurred during an INSERT TABLE AS SELECT command on a table that does not exist.
  • Fixed an issue where polygons from a table which had been appended using SQLImporter might not render. Added command-line option --noPolyRenderGroups/-nprg to disable render group assignment, to match the available WITH option on other import workflows.

OmniSci Immerse

New Feature

  • When basemap sources are not available--for example, if the user is on a private network with no external access--the user is set to offline mode and Immerse uses the lightweight map source bundled with Immerse.

Fixed Issues

  • Fixed an issue so that Combo charts plot correctly with autobinning when the start-of-week flag is set to Saturday or Sunday.
  • Duplication of charts with an empty filter is now allowed.

5.6.0 - April 21, 2021

OmniSci release 5.6.0 contains several new features and improvements and fixes a number of issues.

OmniSciDB

New Features and Improvements

  • Allowed import and export paths are now enforced. Confirm that any import or export paths are included via the configuration parameters allowed-export-paths and allowed-import-paths.
  • Loading data now more strictly enforces validation of numeric inputs. Previously, an out-of-range integer would “wrap” and be ingested as an incorrect value. In Release 5.6, numerics are validated and if validation fails, the row is rejected.
  • String dictionaries do not load if you exceed their capacities; previously, they inserted null for the string.
  • Utility jar mapd-1.0-SNAPSHOT-jar-with-dependencies.jar is renamed. Scripts using SQLimporter need to reference the new .jar omnisci-utility-5.6.0.jar.
  • Significant speedups for point-in-polygon joins (ST_CONTAINS, ST_INTERSECTS) via optimized algorithms and range join hash framework; on by default for point-in-polygon joins.
  • Approximate median function support through the new approx_median aggregate function. Currently works only in nondistributed deployments and executes on CPU.
  • Query interrupt is now possible when the query is outside of core kernel execution (for example, copy/import, CTAS, ITAS, query data fetch, result reduction). On by default and controlled by new flag enable-non-kernel-time-query-interrupt.
  • Query interrupt during core kernel execution has been optimized so that enabling it with the existing enable-runtime-query-interrupt flag should have little to no impact on query performance; actual impact depends on the query pattern.
  • Added partial column specification for INSERT and INSERT FROM SELECT, such that data can be inserted and appended to a subset of columns in a table. NULL values are added to columns not specified in the statements.
  • Updated the CREATE TABLE command max_rollback_epochs option with a default value of 3. This reduces the number of epochs/data versions that are stored. When combined with the new space reclamation on delete, this optimizes space utilization. Setting the option to a high value stores more epochs and reduces data compaction and space reclamation. To update older tables, use the ALTER TABLE <table> SET MAX_ROLLBACK_EPOCHS=<value> command.
  • Added automatic metadata updates and vacuuming. On UPDATE and DELETE queries, OmniSciDB updates metadata, vacuums, and re-uses space as specified. Automatic metadata updates are turned on by default. See enable-auto-metadata-update in Configuration Parameters. Automatic vacuuming on delete occurs when deletes or updates on variable length columns exceed the vacuum threshold. For more information, see vacuum-min-selectivity in Configuration Parameters.
For optimal space usage, make sure the table’s MAX_ROLLBACK_EPOCHS option is set to 3 or lower. This is set by default for the CREATE TABLE command on new tables created in release 5.6 and higher.
For existing databases, OmniSci recommends using ALTER TABLE <table> SET MAX_ROLLBACK_EPOCHS=3 to cap the space usage, and OPTIMIZE TABLE [<table>] WITH (VACUUM='true'); to reclaim space. This primes the system for efficient space management.
  • Added RENAME TABLE command to enable one or more tables to be renamed at the same time.
  • OmniSciDB now enforces permitted file paths for import/export operations by default. The default import and export paths ({data directory}/map_import and {data directory}/map_export respectively) are allowed by default, so no change is required when doing an import/export from Immerse or for sample data import on server startup. However, when using other commands with user-provided paths (such as COPY FROM), the provided paths must be under an allowed root path. See the Configuration Parameters for more information.
    NOTE: For distributed systems, allowed paths only need to be in the configuration file on the aggregator. However, for consistency and future compatibility, OmniSci recommends that the allowed paths also be added to the leaf omnisci.conf file.
  • Significantly improved OmniSciDB startup time. Removed extra metadata seeks/reads during startup. Systems with high epoch counts or large column counts will have marked improvements.
  • Validation of numeric inputs is more strictly enforced on load. Previously an out-of-range integer would “wrap” and be ingested as some incorrect value. Numeric-typed columns are now validated, and if validation fails, the row is rejected.
  • String dictionary reaching capacity during import now throws an error and stops the load instead of inserting nulls in the dictionary-encoded column.
  • Significant performance improvements for load into high-cardinality dictionary-encoded text columns via optimized dictionary hash table resizing.
  • Improved load performance on systems with many cores. Default max number of threads to use is capped at 32, instead of 2X the number of cores, which can cause contention on systems with many cores.
  • Improved the performance of left join queries with filters by searching for left-hand side filters that can be executed before the join, potentially reducing the cardinality of the join.
  • Support date format %m/%d/%y with 2-digit year on data import.
  • Improved the performance of common multicolumn sort patterns.
  • Chunksize default increased to 2 GB to allow for more records per fragment, particularly when variable-length columns are present (none-encoded strings, arrays, and geospatial types). This allows certain query patterns, such as large joins that might previously span multiple fragments, to run faster.
  • Entitlements (introduce simple row level security via SAML attributes).
  • S3 import now accepts a session token; the WITH option is s3-session-token.
  • Upgraded to Calcite 1.25.
  • Improved the accuracy of ST_Contains and ST_Intersects at small geospatial scales.
  • Narrowing casts are now allowed in SQLImporter.
  • Upgraded to SQLite 3.34.
  • Allow SAML and LDAP to provide authentication without having to also provide authorization.
  • Utility jar renamed and versioned: omnisci-utility-5.6.0-SNAPSHOT.jar.
  • Added support for CURRENT_TIMESTAMP, CURRENT_DATE, and CURRENT_TIME functions.
  • Improved READ-ONLY support.
  • UDTFs now allow for a ColumnList argument type, which allows a variable number of arguments to be used for inputs.
  • Improved the granularity of the session lock, allowing local users to access the system even if third-party authentication providers have failed or are experiencing high latency.
  • Many bug fixes and performance improvements.
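The change in numeric validation on load described above (out-of-range integers previously “wrapped”; the row is now rejected) can be pictured with a small sketch. It assumes a plain 32-bit signed integer range for illustration; the exact valid range of each HeavyDB column type is not stated in these notes, and all names below are hypothetical.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def wrap_to_int32(value):
    """Illustrate wrapping: keep only the low 32 bits, read as a signed integer."""
    return (value + 2**31) % 2**32 - 2**31


def load_rows(values):
    """Illustrate validation: keep in-range values, reject the rest."""
    accepted, rejected = [], []
    for value in values:
        if INT32_MIN <= value <= INT32_MAX:
            accepted.append(value)
        else:
            rejected.append(value)
    return accepted, rejected


if __name__ == "__main__":
    print(wrap_to_int32(3_000_000_000))      # wrapped, silently incorrect value
    print(load_rows([42, 3_000_000_000]))    # out-of-range row is rejected instead
```

With wrapping, the out-of-range input is stored as an unrelated negative number; with validation, it never reaches the table.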

Fixed Issues

  • Fixed issue causing a crash when UDTFs were composed.
  • Fixed an issue where definition of CPU UDTFs could lead to all subsequent queries running on CPU.
  • Resolved an issue where selecting from a view with a self join could cause a crash.
  • Improved the ability of the parser to handle complex UPDATE and DELETE queries.
  • Added human-friendly error messages to UPDATE and DELETE queries.
  • Resolved an issue where specifying warmup queries would cause the aggregator to crash in distributed mode.
In distributed mode, specify warmup queries for the aggregator and all leaf nodes to work correctly.
  • Resolved an issue where an equijoin between array columns could crash. Now, an attempt to join throws an exception if the join is not supported.
  • Resolved an issue where window functions in a subquery could cause a crash.
  • Resolved an issue where window function aggregates could crash if used with filters.
  • Improved sorting performance and resolved a number of stability and correctness issues.
  • Resolved an issue in distributed mode where subqueries over replicated tables could return duplicate results.
  • Resolved several potential deadlocks between the rendering engine and the query engine when multiple queries or types of queries are in process.
  • Fixed an issue where SHOW CREATE TABLE would show the wrong table name after a table with shared dictionaries was renamed.
  • Resolved an issue where a COPY TO command with a subquery could fail.

OmniSci Rendering

New Features and Improvements

  • Improved the quality of density accumulation rendering by incorporating a technique that uses 64-bit atomic operations instead of 32-bit when calculating standard deviation. This allows for a greater range of colors when using statistical measures to color the densities. The 32-bit technique was prone to uncaught overflows that resulted in a much more clamped color range.
  • Performance improvement when using a Mercator projection explicitly for marks using the Vega projection property.
  • Multi-GPU compositor performance and memory footprint is improved through better pipelining and reduced copy operations.
  • New Arrow stock symbol shape can be used in symbol/legacy symbol marks.
  • Line and polygon stroke rendering improved by 30-40% by simplifying the math to build the stroke geometry.
  • Significant improvements to the quality of point/symbol rendering, achieved by improvements to procedural anti-aliasing technique and to render sample capturing when drawing procedural symbols.
  • Polygon geometry imported via the SQLImporter path can now be rendered.
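The overflow problem behind the move from 32-bit to 64-bit accumulation for standard deviation, mentioned above, can be sketched in a few lines. This is a simplified model that assumes unsigned integer accumulators for a running sum and sum of squares; it is not the renderer's actual code, and the function name is hypothetical.

```python
def variance_with_accumulator(values, bits):
    """Compute population variance using accumulators that wrap at the given bit width."""
    mask = (1 << bits) - 1
    total = 0
    total_sq = 0
    for value in values:
        total = (total + value) & mask            # running sum, wraps on overflow
        total_sq = (total_sq + value * value) & mask  # running sum of squares, wraps on overflow
    count = len(values)
    mean = total / count
    return total_sq / count - mean * mean


if __name__ == "__main__":
    samples = [70000] * 4  # identical values, so the true variance is 0
    print(variance_with_accumulator(samples, 64))
    print(variance_with_accumulator(samples, 32))
```

With identical inputs the 64-bit accumulators give a variance of zero, while the 32-bit sum of squares wraps silently and produces a meaningless negative value, the kind of uncaught overflow that would distort a color scale driven by standard deviation.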

Fixed Issues