
data Property

Use the Vega data property to specify the visualization data sources by providing an array of one or more data definitions. A data definition must be an object identified by a unique name, which can be referenced in other areas of the specification. Data can be statically defined inline ("values":), can reference columns from a database table using a SQL statement ("sql":), or can be loaded from an existing data set ("source":).

JSON format:

"data": [
  {
    "name": <dataID>,
    "format": {
      "type": "lines" | "polys",
      "coords": {
        "x": <array>
        "y": <array>
      }
      "layout": "interleaved" | "sequential"
    "values": <valueSet> | "SQL": <dataSource> | "source": <dataSource>,
    "transform": [
      {
        "type": "aggregate"
         "fields": ["string":"string"]
         "ops": ["keyword":"keyword"]
         "as": ["string":"string"]
      }  
  },
  {
     ...
  }
]

The data specification has the following properties:

  • name (string; required): User-assigned name that uniquely identifies the data set.

  • format (string or object): How the data are parsed. polys and lines are the only supported format mark types, and they are for rendering purposes only. Use the single-string "short form" for polygon and simple linestring renders. Use the JSON-object "long form" to provide more information for rendering more complex line types.

  • values | sql | source: The data source:

    • values: Embedded, static data values defined inline as JSON.

    • sql: A SQL query that loads the data.

    • source: The name of an existing data set to load the data from.

  • transform (array): An array of transforms to perform on the input data. The output of the transform pipeline then becomes the value of this data set. Currently, transforms can be used only with source data set types.

  • enableHitTesting (boolean): If true, automatically adds rowid column(s) to the SQL statement, which is required for hit-testing using the get_result_row_for_pixel endpoint.

Examples

Load discrete x and y column values inline using the values data source type:

vegaSpec = {
    width: 384,
    height: 564,
    data: [
        {
          name: "coordinates",
          values: [ {"x":0, "y":3}, {"x":1, "y":5} ]
        }
    ],
    scales: [ ... elided ... ],
    marks: [ ... elided ... ]
};

Use the sql data source type to load latitude and longitude coordinates from the tweets_data database table:

vegaSpec = {
    width: 384,
    height: 564,
    data: [
        {
          name: "tweets",
          sql: "SELECT lon as x, lat as y FROM tweets_data WHERE (lon >= -32 AND lon < 66) AND (lat >= -45 AND lat < 68)"
        }
    ],
    scales: [ ... elided ... ],
    marks: [ ... elided ... ]
};

Use the source type to reference the data set defined in the sql data section and perform aggregation transforms:

vegaSpec = {
    width: 384,
    height: 564,
    data: [
        {
          name: "tweets",
          sql: "SELECT lon as x, lat as y FROM tweets_data WHERE (lon >= -32 AND lon < 66) AND (lat >= -45 AND lat < 68)"
        },
        {
          name: "tweets_stats",
          source: "tweets",
          transform: [
              {
                  type: "aggregate",
                  fields: ["x", "x"],
                  ops: ["min", "max"],
                  as: ["minx", "maxx"]
              }
          ]
        }
    ],
    scales: [ ... elided ... ],
    marks: [ ... elided ... ]
};

Data Properties

name

The name property uniquely identifies a data set, and is used for reference by other Vega properties, such as the marks property.
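
For example, a marks entry binds to a data set by name through its from property. A minimal sketch (the points mark type and the x and y fields are illustrative, following the earlier tweets example):

"marks": [
  {
    "type": "points",
    "from": { "data": "tweets" },
    "properties": {
      "x": { "scale": "x", "field": "x" },
      "y": { "scale": "y", "field": "y" }
    }
  }
]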

format

The format property indicates that data preprocessing is needed before rendering the query result. If this property is not specified, data is assumed to be in row-oriented JSON format. This property is required for polys and lines mark types, and it has one of two forms:

  • The "short form", where format is a single string, which must be either polys or lines. This form is used for all polygon rendering, and for fast ‘in-situ’ rendering of LINESTRING data.

  • The "long form", where format is an object containing other properties, as follows:

The long-form format object has the following properties:

  • type: The marks property type: lines or polys.

  • coords: Applies to type: lines. Specifies x and y arrays, which must both be the same size. This permits extraction of the columns pertaining to line rendering, placing them in a rendering buffer. The coords property also dictates the ordering of points in the line. Separate x- and y-array columns are also supported.

  • layout: (Optional) Applies to type: lines. Specifies how vertices are packed in the vertices column. All arrays must have the same layout:

    • interleaved: (default) All elements corresponding to a single vertex are ordered in adjacent pairs. For example: x0, y0, x1, y1, x2, y2.

    • sequential: All elements of the same axis are adjacent. For example: x0, x1, x2, y0, y1, y2.

For lines, each row in the query corresponds to a single line.

This lines format example of interleaved data renders ten lines, all of the same length.

"data": [
  {
    "name": "table",
    "sql": "select lineArrayTest.rowid as rowid, vertices, color from lineArrayTest order by color desc limit 10;",
    "format": {
      "type": "lines",
      "coords": {
        "x": ["vertices"],
        "y": [
          {"from": "vertices" }
        ]
      },
      "layout": "interleaved"
    }
  }
]

In this lines format example of sequential data, x only stores points corresponding to the x coordinate and y only stores points corresponding to the y coordinate. Make sure that columns only contain a single coordinate if using multiple columns in sequential layout.

"data": [
  {
    "name": "table",
    "sql": "select lineArrayTestSeq.rowid as rowid, x, y, color from lineArrayTestSeq order by color desc limit 10;",
    "format": {
      "type": "lines",
      "coords": {
        "x": ["x"],
        "y": ["y"]
      },
    "layout": "sequential"
    }
  }
],

The following example shows a fast "in-situ" LINESTRING format:

"data": [
  {
    "name": "table",
    "format": "lines",
    "sql": "SELECT rowid, linestring_column, ... FROM ..."
  }
]

The following example shows a polys format:

"data": [
  {
    "name": "polys",
    "format": "polys",
    "sql": "SELECT ... elided ..."
  }
]

Data Source

The data source property key-value pair specifies the location of the data and defines how the data is loaded:

  • source (string): Data is loaded from an existing data set. The value names the existing Vega data set to use as this data set's source; use it in combination with a transform pipeline to derive new data. You can source only one existing data set.

  • sql (SQL statement): Data is loaded using a SQL statement. You can use extension functions in the statement to convert distance in meters from a coordinate or point to a pixel size, and to determine whether a coordinate or point is located within a view defined by latitude and longitude. For more information, see OmniSci SQL Extensions.

  • values (JSON data): Data is loaded from static, key-value pair data definitions.
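
A minimal sketch of a data block using such an extension function. This assumes the is_point_in_view extension described in Improving Rendering with SQL Extensions; treat the function name and argument order as an assumption to verify against that page:

"data": [
  {
    "name": "tweets_in_view",
    "sql": "SELECT lon as x, lat as y FROM tweets_data WHERE is_point_in_view(lon, lat, -124.0, 24.0, -66.0, 50.0)"
  }
]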

transform

Transforms process a data stream to calculate new aggregated statistic fields and derive new data streams from them. Currently, transforms are specified only as part of a source data definition. Transforms are defined as an array of specific transform types that are executed in sequential order. Each element of the array must be an object and must contain a type property. Currently, two transform types are supported: aggregate and formula.

The transform types are:

  • aggregate: Performs aggregation operations on input data columns to calculate new aggregated statistic fields and derive new data streams from them. The following properties are required:

    fields: An array of strings referencing columns from the sourced data table.

    ops: An array of keyword strings and objects indicating the predefined operations to perform. For objects, the type property is required and names the aggregation function. Supported operators:

  • count: The total count of data objects in the group.

  • countdistinct: The number of distinct values in an input data column; operates only on numeric or dictionary-encoded string columns.

  • distinct: An array of distinct values from an input data column; operates only on numeric or dictionary-encoded string columns.

  • max: The maximum field value.

  • mean / average / avg: The mean (average) field value.

  • median: The median of an input data column; operates only on numeric columns.

  • min: The minimum field value.

  • missing: The count of field values that are null or undefined.

  • quantile: An array of quantile separators; see https://en.wikipedia.org/wiki/Quantile. Operates only on numeric columns:

    • numQuantiles: The number of contiguous intervals to create; returns the separators for the intervals. The number of separators equals numQuantiles - 1. Range: 1-100. Default: 4

    • includeExtrema: Whether to include min and max values (extrema) in the resulting separator array. When true, the resulting array size is numQuantiles + 1. Values: true or false. Default: false

  • sum: The sum of field values.

  • stddev: The sample standard deviation of field values.

  • stddevp: The population standard deviation of field values.

  • valid: The count of field values that are not null nor undefined.

  • variance: The sample variance of field values.

  • variancep: The population variance of field values.

as: An array of strings used as output names of the operations for later reference.
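
Operators that take options, such as quantile, use the object form described above. A minimal sketch (the x column and the xquartiles output name are illustrative) that computes quartile separators, including the extrema:

"transform": [
  {
    "type": "aggregate",
    "fields": ["x"],
    "ops": [ { "type": "quantile", "numQuantiles": 4, "includeExtrema": true } ],
    "as": ["xquartiles"]
  }
]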

  • formula: Evaluates a user-defined expression. The following properties are required:

    expr: An expression string to be evaluated. Expressions currently support a limited set of operators and functions.

    as: A string used as an output name for later reference.

Note: Currently, expressions can only be evaluated against outputs (as values) from prior aggregate transforms.

See Tutorial: Using Transforms for more detailed examples.
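
A sketch chaining the two transform types (the output names and the expression are illustrative): an aggregate produces minx and maxx, and a formula derives a new field from those outputs:

"transform": [
  {
    "type": "aggregate",
    "fields": ["x", "x"],
    "ops": ["min", "max"],
    "as": ["minx", "maxx"]
  },
  {
    "type": "formula",
    "expr": "maxx - minx",
    "as": "xrange"
  }
]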

enableHitTesting

If true, automatically adds rowid column(s) to the SQL statement where appropriate, enabling the data block for hit-testing using the get_result_row_for_pixel endpoint.

If false, the data block is not automatically hit-test enabled, and any later get_result_row_for_pixel calls return empty hit-test results.

If the enableHitTesting property is not present, the following legacy behavior is used as the default:

  • If the SQL statement represents a projection query, hit-testing is enabled if a rowid column is explicitly projected.

  • If the SQL statement represents an aggregate query, hit-testing is always enabled.

This legacy behavior will likely be deprecated and removed in an upcoming version of OmniSci. At that point, the enableHitTesting property will be required for activating hit-test support for the data.
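For example, to opt a data block in to hit-testing explicitly (reusing the tweets_data table from the earlier examples):

"data": [
  {
    "name": "tweets",
    "sql": "SELECT lon as x, lat as y FROM tweets_data",
    "enableHitTesting": true
  }
]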

