Vega

Vega is a visualization specification language that describes how to map your source data to your viewing area. By creating a JSON Vega specification structure, you define the data and the transformations to apply to the data to produce meaningful visualizations. The specification includes the geometric shape that represents your data, scaling properties that map the data to the visualization area, and graphical rendering properties.

HEAVY.AI uses Vega for backend rendering. Using the API, the client sends the Vega JSON to the backend, which renders the visualization and returns a PNG image for display. See the Vega Tutorials for backend rendering examples.

The topics in this guide define and describe the HEAVY.AI implementation of Vega, and provide examples you can use as a basis for your own visualizations:

- Vega Tutorials - Introduces you to Vega specification patterns so you can start creating visualizations quickly and easily. Each tutorial has example code that demonstrates a particular feature or pattern. Tutorials start with basic Vega concepts and an introduction to the API for communication with the backend. Other tutorials provide more in-depth information about specific Vega implementations.
- Vega Reference - Describes the HEAVY.AI implementation of Vega specification syntax and associated rules. Also includes links to Vega standards and related specifications.
- If you are upgrading to Release 5.2 or higher, you need to migrate any code that renders polygons in cached mode to dynamic poly rendering.
- Try Vega - Try the HEAVY.AI Vega engine and work with various examples. See your changes to Vega code in real time.

Vega Tutorials

These tutorials introduce you to common Vega specification patterns so you can start creating visualizations quickly and easily. Each tutorial uses a code example that demonstrates a particular Vega feature or pattern. The Getting Started with Vega tutorial covers basic Vega concepts and serves as the foundation for some of the tutorials that follow, and introduces you to the API for communication with the backend. Other tutorials provide more in-depth information about specific Vega implementations.

Use these tutorials to gain a better understanding of Vega by experimenting with them to create new visualizations on your own HEAVY.AI system and database. You can also Try Vega to make adjustments to Vega code and see real-time changes in charts.

For information about the Vega specification syntax and properties, see Vega Reference.

Tutorial Framework

Because the tutorials focus on the Vega specification, they use a simple client implementation that sends the render request to the HEAVY.AI server and handles the response:

Common index.html

<!DOCTYPE html>
<html lang="en">
  <head>
    <title>HEAVY.AI</title>
    <meta charset="UTF-8">
  </head>
  <body>
    <script src="js/browser-connector.js"></script>
    <script src="js/vegaspec.js"></script>
    <script src="js/vegademo.js"></script>

    <script>
    document.addEventListener('DOMContentLoaded', init, false);
    </script>
  </body>
</html>

Common vegademo.js

function init() {
    var vegaOptions = {}
    var conn = new MapdCon()
        .protocol("http")
        .host("my.host.com")
        .port("6273")
        .dbName("heavyai")
        .user("heavyai")
        .password("changeme")
        .connect(function(error, con) {
          con.renderVega(1, JSON.stringify(exampleVega), vegaOptions, function(error, result) {
            if (error) {
              console.log(error.message);
            }
            else {
              var blobUrl = `data:image/png;base64,${result.image}`
              var body = document.querySelector('body')
              var vegaImg = new Image()
              vegaImg.src = blobUrl
              body.append(vegaImg)
            }
          });
        });
}

The renderVega() function sends the exampleVega JSON structure described in the tutorials. Getting Started with Vega covers Vega library dependencies and the renderVega() function in more detail.

Finding the Cause of Errors

On a connection error, you can view the error message to determine the cause of the error. To determine the cause of Vega specification errors, catch and handle the renderVega() exception.
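
As a minimal sketch of that error handling, the helper below wraps the render call in try/catch and routes both thrown exceptions and callback errors to one handler. The name renderWithErrorHandling and the onImage and onError callbacks are hypothetical and are not part of the connector API. The con argument is assumed to be the connection object passed to the connect() callback.

```javascript
// Hypothetical helper; not part of browser-connector.js.
// `con` is assumed to expose renderVega(widgetId, vegaJson, options, callback),
// as used in the tutorial framework above.
function renderWithErrorHandling(con, spec, options, onImage, onError) {
  try {
    con.renderVega(1, JSON.stringify(spec), options, function (error, result) {
      if (error) {
        // Error reported through the callback.
        onError(error);
        return;
      }
      // Success: hand the caller a data URI for the returned PNG.
      onImage("data:image/png;base64," + result.image);
    });
  } catch (exception) {
    // Error thrown by the render call itself.
    onError(exception);
  }
}
```

In the tutorial client, onImage could set the src of an Image element and onError could log error.message.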

Available Tutorials

Vega at a Glance - Provides an overview of Vega and a simple example to visualize tweets.
Getting Started with Vega - Maps a continuous, quantitative input domain to a continuous output range. Uses the same visualization as Vega at a Glance, but elaborates on the runtime environment and implementation steps.
Getting More from Your Data - Builds on the Getting Started with Vega tutorial by color-coding tweets according to language.
Creating More Advanced Charts - Introduces Symbol Type marks by creating a heatmap visualization of political donations.
Using the Poly Marks Type - Shows how to use Polys Type marks, which use an implicit polygon data table format. The visualization in the tutorial is a map of zip codes, color-coded according to average political contribution amount.
Vega Accumulator - Describes the three modes of accumulation rendering and provides some implementation examples. The data used contains information about political donations, including party affiliation, the amount of the donation, and location of the donor.
Using Transform Aggregation - Shows how to create Vega-based visualizations with render properties that are driven by aggregated statistics. Use Vega transform aggregation and formula expressions to automate the process of gathering statistical information about a rendered query.
Improving Rendering with SQL Extensions - Describes how to use SQL extension functions in Vega to map meters to pixels and improve map rendering.

Vega at a Glance

Source code is located at the end of this topic.

This tutorial provides an overview of Vega and a simple example to visualize tweets in the EMEA geographic region:

Defining the Source Data

The Vega JSON structure maps data to geometric primitives.

The first task is to specify the data source. You can either define data statically or use a SQL query. This example uses a SQL query to get tweet geolocation information from a tweets database:

SELECT goog_x as x, goog_y as y, tweets_nov_feb.rowid FROM tweets_nov_feb

The resulting SQL columns can be referenced in other parts of the specification to drive visualization elements. In this example, the projected columns are goog_x and goog_y, which are renamed x and y, and rowid, which is a requirement for hit-testing.

Creating a Visualization using Vega

The Vega specification for this example includes the following top-level properties:

height and width, which define the height and width of the visualization area.
data, which defines the data source. The SQL data described above is defined here with the label tweets for later referencing.
marks, which describes the geometric primitives used to render the visualization.
scales, which are referenced by marks to map input domain values to appropriate output range values.

Here is the full Vega specification used in this example:

const exampleVega = {
  "width": 384,
  "height": 564,
  "data": [
    {
      "name": "tweets",
      "sql": "SELECT goog_x as x, goog_y as y, tweets_nov_feb.rowid FROM tweets_nov_feb"
    }
  ],
  "scales": [
    {
      "name": "x",
      "type": "linear",
      "domain": [
        -3650484.1235206556,
        7413325.514451755
      ],
      "range": "width"
    },
    {
      "name": "y",
      "type": "linear",
      "domain": [
        -5778161.9183506705,
        10471808.487466192
      ],
      "range": "height"
    }
  ],
  "marks": [
    {
      "type": "points",
      "from": {
        "data": "tweets"
      },
      "properties": {
        "x": {
          "scale": "x",
          "field": "x"
        },
        "y": {
          "scale": "y",
          "field": "y"
        },
        "fillColor": "blue",
        "size": {
          "value": 3
        }
      }
    }
  ]
};

The following sections describe the top-level Vega specification properties.

Define the Visualization Area Dimensions

The width and height properties define a visualization area 384 pixels wide and 564 pixels high:

"width": 384
"height": 564

The scales position encoding properties map the marks into this visualization area.

Define the Marks

The marks property defines visualization geometric primitives. The OmniSci Vega implementation defines the following primitive types:

lines - A line
points - A point
polys - A polygon
symbol - A geometric symbol, such as a circle or square

Each primitive type has a set of properties that describe how the primitive is positioned and styled.

This example uses points to represent the tweets data:

"marks": [
  {
    "type": "points",
    "from": {
      "data": "tweets"
    },
    "properties": {
      "x": {
        "scale": "x",
        "field": "x"
      },
      "y": {
        "scale": "y",
        "field": "y"
      },
      "fillColor": "blue",
      "size": {
        "value": 3
      }
    }
  }
]

Points support the following properties; not all are included in the example:

x - The x position of the point in pixels.
y - The y position of the point in pixels.
z - The depth coordinate of the point in pixels.
fillColor - The color of the point.
fillOpacity - The opacity of the fill, from transparent (0) to opaque (1).
opacity - The opacity of the point as a whole, from transparent (0) to opaque (1).
size - The diameter of the point in pixels.

The points in the example reference the tweets SQL data and use the x and y columns from the SQL to drive the position of the points. The positions are appropriately mapped to the visualization area using scales as described in Scale Input Domain to Output Range. The fill color is set to blue and point size is set to three pixels.

Scale Input Domain to Output Range

The scales definition maps data domain values to visual range values, where the domain property determines the input domain for the scale. See the d3-scale reference for background information about how scaling works.

This example uses linear scales to map Mercator-projected coordinates into pixel coordinates for rendering.

"scales": [
  {
    "name": "x",
    "type": "linear",
    "domain": [
      -3650484.1235206556,
      7413325.514451755
    ],
    "range": "width"
  },
  {
    "name": "y",
    "type": "linear",
    "domain": [
      -5778161.9183506705,
      10471808.487466192
    ],
    "range": "height"
  }
]

The x and y scales use linear interpolation to map point x- and y-coordinates to the width and height of the viewing area. The width and height properties are predefined keywords that equate to the range [0, <current width>] and [0, <current height>].
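
To make the interpolation concrete, here is a small sketch of what a linear scale computes. It is a conceptual model only, not the backend implementation, and the function name linearScale is hypothetical.

```javascript
// Conceptual model of a linear scale: interpolate from [d0, d1] to [r0, r1].
function linearScale(domain, range) {
  const d0 = domain[0];
  const d1 = domain[1];
  const r0 = range[0];
  const r1 = range[1];
  return function (value) {
    const t = (value - d0) / (d1 - d0); // position within the domain, 0..1
    return r0 + t * (r1 - r0);
  };
}

// "range": "width" equates to [0, width]; the width is 384 in this example.
const xScale = linearScale([-3650484.1235206556, 7413325.514451755], [0, 384]);
console.log(xScale(-3650484.1235206556)); // 0
console.log(xScale(7413325.514451755));   // 384
```

Values between the two domain bounds land proportionally between pixel 0 and pixel 384.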

After completing the Vega specification, you send the JSON structure to the backend for rendering.

Connecting to the Server and Rendering the Visualization

The following steps summarize the rendering and visualization sequence:

Instantiate the MapdCon object for connecting to the backend.
Call the connect method with server information, user credentials, and data table name.
Provide the renderVega() callback function to connect() and include the Vega specification as a parameter.
Display the returned PNG image in your client browser window.

OmniSci uses Apache Thrift for cross-language client communication with the backend. Include browser-connector.js, the connector API file, which provides the Thrift interface libraries and the renderVega() function:

<script src="js/browser-connector.js"></script>

The following example encapsulates the connect, render request, and response handling sequence:

var vegaOptions = {}
var connector = new MapdCon()
  .protocol("http")
  .host("my.host.com")
  .port("6273")
  .dbName("omnisci")
  .user("omnisci")
  .password("changeme")
  .connect(function(error, con) {
    con.renderVega(1, JSON.stringify(exampleVega), vegaOptions, function(error, result) {
      if (error) {
        console.log(error.message);
      }
      else {
        var blobUrl = `data:image/png;base64,${result.image}`
        var body = document.querySelector('body')
        var vegaImg = new Image()
        vegaImg.src = blobUrl
        body.append(vegaImg)
      }
    });
  });

Next Steps

This example demonstrated the basic concepts for understanding and using Vega. To become comfortable with Vega, try this example using your own OmniSci instance, changing the MapdCon() parameters to match your host environment and database.

As you gain experience with Vega and begin writing your own applications, see the Reference for detailed information about Vega code.

Source Code

HTML

Vega at a Glance index.html

<!DOCTYPE html>
<html lang="en">
  <head>
    <title>OmniSci</title>
    <meta charset="UTF-8">
    <style>
      .title {
        font-weight: bold;
        text-align:center;
      }
      .omnisci {
        position: relative;
        top: 2px;
      }
      .search{
        display: inline-block;
        margin-top: 12px;
        margin-left: 50px;
      }
      .data-count {
        padding-right:20px;
      }
      .filter-count {
        font-weight: bold;
        color:  #45B1E8;
      }
    </style>
  </head>
  <body>

    <script src="js/browser-connector.js"></script>

    <script>

    function init() {
      const exampleVega = {
        "width": 384,
        "height": 564,
        "data": [
          {
            "name": "tweets",
            "sql": "SELECT goog_x as x, goog_y as y, tweets_data_table.rowid FROM tweets_data_table"
          }
        ],
        "scales": [
          {
            "name": "x",
            "type": "linear",
            "domain": [
              -3650484.1235206556,
              7413325.514451755
            ],
            "range": "width"
          },
          {
            "name": "y",
            "type": "linear",
            "domain": [
              -5778161.9183506705,
              10471808.487466192
            ],
            "range": "height"
          }
        ],
        "marks": [
          {
            "type": "points",
            "from": {
              "data": "tweets"
            },
            "properties": {
              "x": {
                "scale": "x",
                "field": "x"
              },
              "y": {
                "scale": "y",
                "field": "y"
              },
              "fillColor": "blue",
              "size": {
                "value": 3
              }
            }
          }
        ]
      };

       var vegaOptions = {}
       var connector = new MapdCon()
         .protocol("http")
         .host("my.host.com")
         .port("6273")
         .dbName("omnisci")
         .user("omnisci")
         .password("changeme")
         .connect(function(error, con) {
           con.renderVega(1, JSON.stringify(exampleVega), vegaOptions, function(error, result) {
             if (error) {
               console.log(error.message);
             }
             else {
               var blobUrl = `data:image/png;base64,${result.image}`
               var body = document.querySelector('body')
               var vegaImg = new Image()
               vegaImg.src = blobUrl
               body.append(vegaImg)
             }
           });
         });
    }

    document.addEventListener('DOMContentLoaded', init, false);
    </script>
  </body>
</html>

Getting Started with Vega

Source code is located at the end of the tutorial.

This tutorial uses the same visualization as Vega at a Glance but elaborates on the runtime environment and implementation steps. The Vega usage pattern described here applies to all Vega implementations. Subsequent tutorials differ only in describing more advanced Vega features.

This visualization maps a continuous, quantitative input domain to a continuous output range. Again, the visualization shows tweets in the EMEA region, from a tweets data table:

Backend rendering using Vega involves the following steps:

Create the Vega Specification
Connect to the Backend
Make the Render Request and Handle the Response

You can create the Vega specification statically, as shown in this tutorial, or programmatically. See the Poly Map with Backend Rendering charting example for a programmatic implementation.

Step 1 - Create the Vega Specification

A Vega JSON specification has the following general structure:

const exampleVega = {
  width: <numeric>,
  height: <numeric>,
  data: [ ... ],
  scales: [ ... ],
  marks: [ ... ]
};
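
Before sending a specification, it can help to confirm on the client that these top-level properties are present. The following is a sketch of such a check. The function name missingTopLevelProperties is hypothetical, the property list is simply the five properties shown above, and the check does not replace any validation done by the backend.

```javascript
// Hypothetical client-side check; the property list comes from the
// general structure shown above.
const TOP_LEVEL_PROPERTIES = ["width", "height", "data", "scales", "marks"];

// Returns the names of the top-level properties that the spec does not define.
function missingTopLevelProperties(spec) {
  return TOP_LEVEL_PROPERTIES.filter(function (name) {
    return !(name in spec);
  });
}

// Example: a draft spec that does not define scales or marks yet.
const draftSpec = { width: 384, height: 564, data: [] };
console.log(missingTopLevelProperties(draftSpec)); // lists "scales" and "marks"
```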

Specify the Visualization Area

The width and height properties define the width and height of your visualization area, in pixels:

const exampleVega = {
  width: 384,
  height: 564,
  data: [ ... ],
  scales: [ ... ],
  marks: [ ... ]
};

Specify the Data Source

This example uses the following SQL statement to get the tweets data:

data: [
    {
        "name": "tweets",
        "sql": "SELECT goog_x as x, goog_y as y, tweets_nov_feb.rowid FROM tweets_nov_feb"
    }
]

The input data are the projected x and y coordinates of tweets (the goog_x and goog_y columns) from the tweets_nov_feb data table. The coordinates are labeled x and y for Field Reference in the marks property, which references the data using the tweets name.

Specify the Graphical Properties of the Rendered Data Item

The marks property specifies the graphical attributes of how each data item is rendered:

marks: [
    {
        type: "points",
        from: {
            data: "tweets"
        },
        properties: {
            x: {
                scale: "x",
                field: "x"
            },
            y: {
                scale: "y",
                field: "y"
            },
            "fillColor": "blue",
            size: {
                value: 3
            }
        }
    }
]

In this example, each data item from the tweets data table is rendered as a point. The points marks type includes position, fill color, and size attributes. The marks property specifies how to visually encode points according to these attributes. Points in this example are three pixels in diameter and colored blue.

Points are scaled to the visualization area using the scales property.

Specify How Input Data are Scaled to the Visualization Area

The following scales specification maps marks to the visualization area.

scales: [
    {
        name: "x",
        type: "linear",
        domain: [
            -3650484.1235206556,
            7413325.514451755
        ],
        range: "width"
    },
    {
        name: "y",
        type: "linear",
        domain: [
            -5778161.9183506705,
            10471808.487466192
        ],
        range: "height"
    }
]

Both x and y scales specify a linear mapping of the continuous, quantitative input domain to a continuous output range. In this example, input data values are transformed to predefined width and height range values.

Later tutorials show how to specify data transformation using discrete domain-to-range mapping.

Step 2 - Connect to the Backend

Use the browser-connector.js renderVega() API to communicate with the backend. The connector is layered on Apache Thrift for cross-language client communication with the server.

Follow these steps to instantiate the connector and to connect to the backend:

Include browser-connector.js located at https://github.com/omnisci/mapd-connector/tree/master/dist to include the MapD connector and Thrift interface APIs.
```
<script src="<localJSdir>/browser-connector.js"></script>
```
Instantiate the MapdCon() connector and set the server name, protocol information, and your authentication credentials, as described in the MapD Connector API:
```
var vegaOptions = {}
var connector = new MapdCon()
  .protocol("http")
  .host("my.host.com")
  .port("6273")
  .dbName("omnisci")
  .user("omnisci")
  .password("HyperInteractive")
```

| Property | Description |
| --- | --- |
| dbName | OmniSci database name. |
| host | OmniSci web server name. |
| password | OmniSci user password. |
| port | OmniSci web server port. |
| protocol | Communication protocol: http, https. |
| user | OmniSci user name. |

Finally, call the MapD connector API connect() function to initiate a connect request, passing a callback function with a (error, success) signature as the parameter.

For example,

.connect(function(error, con) { ... });

The connect() function generates client and session IDs for this connection instance, which are unique for each instance and are used in subsequent API calls for the session.

On a successful connection, the callback function is called. The callback function in this example calls the renderVega() function.

Step 3 - Make the Render Request and Handle the Response

The MapD connector API renderVega() function sends the Vega JSON to the backend, and has the following parameters:

.connect(function(error, con) {
  con.renderVega(1, JSON.stringify(exampleVega), vegaOptions, function(error, result) {
    if (error)
      console.log(error.message);
    else {
      var blobUrl = `data:image/png;base64,${result.image}`
      var body = document.querySelector('body')
      var vegaImg = new Image()
      vegaImg.src = blobUrl
      body.append(vegaImg)
    }
  });
});

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| widgetid | number | | Calling widget ID. |
| vega | string | | |
| options | number | | Render query options. compressionLevel: PNG compression level, from 1 (low, fast) to 10 (high, slow). Default = 3. |
| callback | function | | Callback function with (error, success) signature. |

| Return | Description |
| --- | --- |
| Base64 image | PNG image rendered on server. |

The backend returns the rendered Base64 image in result.image, which you can display in the browser window using a data URI.
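
The tutorial code passes an empty object as the options argument. As a sketch, and assuming that compressionLevel is supplied as a key on that object (an assumption, because the parameter description above does not show the exact shape), a small helper can keep the value within the documented 1 to 10 range. The name buildVegaOptions is hypothetical.

```javascript
// Assumption: render options are passed as an object with a
// compressionLevel key. buildVegaOptions is a hypothetical helper.
const DEFAULT_COMPRESSION_LEVEL = 3; // documented default

function buildVegaOptions(compressionLevel) {
  if (compressionLevel === undefined) {
    return { compressionLevel: DEFAULT_COMPRESSION_LEVEL };
  }
  const valid = Number.isInteger(compressionLevel) &&
    compressionLevel >= 1 && compressionLevel <= 10;
  if (!valid) {
    throw new RangeError("compressionLevel must be an integer from 1 to 10");
  }
  return { compressionLevel: compressionLevel };
}

// Example: request the fastest, lowest compression.
const fastOptions = buildVegaOptions(1);
```

The resulting object would take the place of vegaOptions in the renderVega() call.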

Source Code

Getting Started Directory Structure

 index.html
 /js
   browser-connector.js
   vegaspec.js
   vegademo.js

HTML

Getting Started index.html

<!DOCTYPE html>
<html lang="en">
  <head>
    <title>OmniSci</title>
    <meta charset="UTF-8">
  </head>
  <body>
    <script src="js/browser-connector.js"></script>
    <script src="js/vegaspec.js"></script>
    <script src="js/vegademo.js"></script>

    <script>
    document.addEventListener('DOMContentLoaded', init, false);
    </script>
  </body>
</html>

JavaScript

Getting Started vegademo.js

function init() {
  var vegaOptions = {}
  var connector = new MapdCon()
    .protocol("http")
    .host("my.host.com")
    .port("6273")
    .dbName("omnisci")
    .user("omnisci")
    .password("changeme")
    .connect(function(error, con) {
      con.renderVega(1, JSON.stringify(exampleVega), vegaOptions, function(error, result) {
        if (error) {
          console.log(error.message);
        }
        else {
          var blobUrl = `data:image/png;base64,${result.image}`
          var body = document.querySelector('body')
          var vegaImg = new Image()
          vegaImg.src = blobUrl
          body.append(vegaImg)
        }
      });
    });
}

Getting Started vegaspec.js

const exampleVega = {
  "width": 384,
  "height": 564,
  "data": [
    {
      "name": "tweets",
      "sql": "SELECT goog_x as x, goog_y as y, tweets_data_table.rowid FROM tweets_data_table"
    }
  ],
  "scales": [
    {
      "name": "x",
      "type": "linear",
      "domain": [
        -3650484.1235206556,
        7413325.514451755
      ],
      "range": "width"
    },
    {
      "name": "y",
      "type": "linear",
      "domain": [
        -5778161.9183506705,
        10471808.487466192
      ],
      "range": "height"
    }
  ],
  "marks": [
    {
      "type": "points",
      "from": {
        "data": "tweets"
      },
      "properties": {
        "x": {
          "scale": "x",
          "field": "x"
        },
        "y": {
          "scale": "y",
          "field": "y"
        },
        "fillColor": "blue",
        "size": {
          "value": 3
        }
      }
    }
  ]
};

Getting More from Your Data

Source code can be found at the end of this tutorial.

This tutorial builds on the Getting Started with Vega tutorial by color-coding tweets according to language:

Tweets in English are blue.
Tweets in French are orange.
Tweets in Spanish are green.
All other tweets are light or dark gray.

To highlight language in the visualization, the example specifies the language column query in the Vega data property, and associates language with color.

"data": [
  {
    "name": "tweets",
    "sql": "SELECT goog_x as x, goog_y as y, lang as color, tweets_nov_feb.rowid FROM tweets_nov_feb"
  }
]

"scales": [
     .
     .
     .
  {
    "name": "color",
    "type": "ordinal",
    "domain": ["en",  "es", "fr"],
    "range": ["#27aeef", "#87bc45", "#ef9b20"],
    "default": "gray",
    "nullValue": "#cacaca"
  }
]

You can specify a default color for values not listed in domain and a separate color for data items with a value of null. In this example, tweets in languages other than English, Spanish, or French are colored gray, and tweets with a language value of null are colored light gray (#cacaca).
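
The lookup behavior described here can be modeled in a few lines. This is a conceptual sketch of an ordinal scale with default and nullValue handling, not the backend implementation. The function name ordinalScale is hypothetical.

```javascript
// Conceptual model of an ordinal scale with default and nullValue handling.
function ordinalScale(definition) {
  return function (value) {
    if (value === null || value === undefined) {
      return definition.nullValue; // data item has no value
    }
    const index = definition.domain.indexOf(value);
    // Values not listed in the domain fall back to the default.
    return index === -1 ? definition.default : definition.range[index];
  };
}

const colorScale = ordinalScale({
  domain: ["en", "es", "fr"],
  range: ["#27aeef", "#87bc45", "#ef9b20"],
  default: "gray",
  nullValue: "#cacaca"
});
console.log(colorScale("fr")); // #ef9b20
console.log(colorScale("de")); // gray
console.log(colorScale(null)); // #cacaca
```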

In previous examples, the fill color of points representing tweets was statically specified as blue:

"marks": [
  {
    "type": "points",
    "from": {
      "data": "tweets"
    },
    "properties": {
         .
         .
         .
      },
      "fillColor": "blue",
      "size": {"value": 3}
    }
  }
]

In this example, the fill color is driven by the data instead:

"marks": [
  {
    "type": "points",
    "from": {
      "data": "tweets"
    },
    "properties": {
         .
         .
         .
      },
      "fillColor": {
        "scale": "color",
        "field": "color"
      },
      "size": 3
    }
  }
]

The fillColor references the color scale and performs a lookup on the current language value, from the color data table field.
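
One way to picture this lookup is as a function that resolves a mark property for a single data row. The sketch below is a conceptual model, not the renderer's code. The names resolveProperty, scales, and row are hypothetical, and the color function is a stand-in for the ordinal scale defined in this tutorial.

```javascript
// Conceptual model of resolving one mark property for one data row.
// `scales` maps a scale name to a function.
function resolveProperty(property, row, scales) {
  if (property === null || typeof property !== "object") {
    return property; // literal, such as "blue" or 3
  }
  // Read the value from the row when a field is named, else use "value".
  let value = "field" in property ? row[property.field] : property.value;
  if ("scale" in property) {
    value = scales[property.scale](value); // map through the named scale
  }
  return value;
}

const scales = {
  // Stand-in for the ordinal color scale.
  color: function (lang) {
    const colors = { en: "#27aeef", es: "#87bc45", fr: "#ef9b20" };
    if (lang === null || lang === undefined) return "#cacaca";
    return Object.prototype.hasOwnProperty.call(colors, lang) ? colors[lang] : "gray";
  }
};

const row = { x: 100, y: 200, color: "es" };
console.log(resolveProperty({ scale: "color", field: "color" }, row, scales)); // #87bc45
console.log(resolveProperty("blue", row, scales)); // blue
console.log(resolveProperty({ value: 3 }, row, scales)); // 3
```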

Source Code

Getting More Insight Tutorial Directory Structure

index.html
/js
  browser-connector.js
  vegaspec.js
  vegademo.js

HTML

Getting More Insight Tutorial index.html

<!DOCTYPE html>
<html lang="en">
  <head>
    <title>OmniSci</title>
    <meta charset="UTF-8">
  </head>
  <body>
    <script src="js/browser-connector.js"></script>
    <script src="js/vegaspec.js"></script>
    <script src="js/vegademo.js"></script>

    <script>
    document.addEventListener('DOMContentLoaded', init, false);
    </script>
  </body>
</html>

JavaScript

Getting More Insight Tutorial vegademo.js

function init() {
  var vegaOptions = {}
  var connector = new MapdCon()
    .protocol("http")
    .host("my.host.com")
    .port("6273")
    .dbName("omnisci")
    .user("omnisci")
    .password("changeme")
    .connect(function(error, con) {
      con.renderVega(1, JSON.stringify(exampleVega), vegaOptions, function(error, result) {
        if (error) {
          console.log(error.message);
        }
        else {
          var blobUrl = `data:image/png;base64,${result.image}`
          var body = document.querySelector('body')
          var vegaImg = new Image()
          vegaImg.src = blobUrl
          body.append(vegaImg)
        }
      });
    });
}

Getting More Insight Tutorial vegaspec.js

const exampleVega = {
  "width": 384,
  "height": 564,
  "data": [
    {
      "name": "tweets",
      "sql": "SELECT goog_x as x, goog_y as y, lang as color, tweets_data_table.rowid FROM tweets_data_table"
    }
  ],
  "scales": [
    {
      "name": "x",
      "type": "linear",
      "domain": [
        -3650484.1235206556,
        7413325.514451755
      ],
      "range": "width"
    },
    {
      "name": "y",
      "type": "linear",
      "domain": [
        -5778161.9183506705,
        10471808.487466192
      ],
      "range": "height"
    },
    {
      "name": "color",
      "type": "ordinal",
      "domain": ["en",  "es", "fr"],
      "range": ["#27aeef", "#87bc45", "#ef9b20"],
      "default": "gray",
      "nullValue": "#cacaca"
    }
  ],
  "marks": [
    {
      "type": "points",
      "from": {
        "data": "tweets"
      },
      "properties": {
        "x": {
          "scale": "x",
          "field": "x"
        },
        "y": {
          "scale": "y",
          "field": "y"
        },
        "fillColor": {
          "scale": "color",
          "field": "color"
        },
        "size": 3
      }
    }
  ]
};

Creating More Advanced Charts

Source code is located at the end of this tutorial.

This tutorial introduces you to Symbol Type marks by creating a heatmap visualization. The heatmap shows contribution level to the Republican party within the continental United States:

The contribution data are obtained using the following SQL query, written as a JavaScript template literal so that it can span multiple lines:

"data": [
  {
   "name": "heatmap_query",
   "sql": `SELECT rect_pixel_bin(conv_4326_900913_x(lon), -13847031.457875465, -7451726.712679257, 733, 733) as x,
           rect_pixel_bin(conv_4326_900913_y(lat), 2346114.147993467, 6970277.197053557, 530, 530) as y,
           SUM(amount) as cnt
           FROM contributions
           WHERE (lon >= -124.39000000000038 AND lon <= -66.93999999999943) AND
               (lat >= 20.61570573311549 AND lat <= 52.93117449504004) AND
               amount > 0 AND
               recipient_party = 'R'
           GROUP BY x, y`
  }
]

The visualization uses a Symbol Type marks type to represent each data item in the heatmap_query data table:

"marks": [
    {
        "type": "symbol",
        "from": {
            "data": "heatmap_query"
        },
        "properties": { ... elided ... }
    }
]

The marks properties property specifies the symbol shape, which is a square. Each square is one pixel wide and one pixel high.

"marks": [
    {
        ... elided ...

        "properties": {
            "shape": "square",
            "x": {
                "field": "x"
            },
            "y": {
                "field": "y"
            },
            "width": 1,
            "height": 1,
            "fillColor": {
                "scale": "heat_color",
                "field": "cnt"
            }
        }
    }
]

Notice that the data x and y location values do not reference a scale. The location values are the values of the SQL query, transformed using extension functions.
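
The names conv_4326_900913_x and conv_4326_900913_y suggest a conversion from EPSG:4326 longitude and latitude to EPSG:900913 (Web Mercator) coordinates. As a rough illustration, and assuming the standard spherical Mercator formulas rather than the backend's exact implementation, the conversion can be sketched as follows. The function names and the radius constant are assumptions, and rect_pixel_bin is not modeled here.

```javascript
// Standard spherical Web Mercator formulas. This approximates what the SQL
// extension functions are assumed to compute; it is not the backend's code.
const EARTH_RADIUS_METERS = 6378137;

function lonToMercatorX(lonDegrees) {
  return EARTH_RADIUS_METERS * (lonDegrees * Math.PI / 180);
}

function latToMercatorY(latDegrees) {
  const latRadians = latDegrees * Math.PI / 180;
  return EARTH_RADIUS_METERS * Math.log(Math.tan(Math.PI / 4 + latRadians / 2));
}

// Western bound of the query's longitude filter, converted to meters.
console.log(lonToMercatorX(-124.39));
```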

The fill color of the square uses the heat_color scale to determine the color used to represent the data item.

Quantize scales are similar to linear scales, except they use a discrete rather than continuous range. The continuous input domain is divided into uniform segments based on the number of values in the output range.

"scales": [
    {
        "name": "heat_color",
        "type": "quantize",
        "domain": [
            10000.0,
            1000000.0
        ],
        "range": [ "#0d0887", "#2a0593", "#41049d", "#5601a4", "#6a00a8",
                   "#7e03a8", "#8f0da4", "#a11b9b", "#b12a90", "#bf3984",
                   "#cb4679", "#d6556d", "#e16462", "#ea7457", "#f2844b",
                   "#f89540", "#fca636", "#feba2c", "#fcce25", "#f7e425", "#f0f921"
        ],
        "default": "#0d0887",
        "nullValue": "#0d0887"
    }
]

The heat_color scale is a quantize scale. In the example, dollar amounts between $10,000 and $1 million are uniformly divided among 21 range values, where larger amounts are represented by brighter colors.

Values outside the domain and null values are rendered as dark blue, #0d0887.
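
The segment lookup can be sketched as follows. This is a conceptual model based on the description in this tutorial (uniform segments, with default and nullValue fallbacks), not the backend implementation. The function name quantizeScale is hypothetical, and the range is shortened to four colors for readability.

```javascript
// Conceptual model of a quantize scale as described in this tutorial.
function quantizeScale(definition) {
  const d0 = definition.domain[0];
  const d1 = definition.domain[1];
  const segments = definition.range.length;
  return function (value) {
    if (value === null || value === undefined) {
      return definition.nullValue;
    }
    if (value < d0 || value > d1) {
      return definition.default; // outside the domain
    }
    // Position in the domain (0..1) times the segment count gives the index;
    // the upper bound itself falls into the last segment.
    const position = (value - d0) / (d1 - d0);
    const index = Math.min(segments - 1, Math.floor(position * segments));
    return definition.range[index];
  };
}

// Shortened four-color range for illustration; the tutorial uses 21 colors.
const heatColor = quantizeScale({
  domain: [10000.0, 1000000.0],
  range: ["#0d0887", "#7e03a8", "#cb4679", "#f0f921"],
  default: "#0d0887",
  nullValue: "#0d0887"
});
console.log(heatColor(10000));   // #0d0887 (first segment)
console.log(heatColor(1000000)); // #f0f921 (last segment)
console.log(heatColor(5000));    // #0d0887 (default, below the domain)
```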

Source Code

Advanced Chart Type Tutorial Directory Structure

index.html
/js
  browser-connector.js
  vegaspec.js
  vegademo.js

HTML

Advanced Chart Type Tutorial index.html

<!DOCTYPE html>
<html lang="en">
  <head>
    <title>OmniSci</title>
    <meta charset="UTF-8">
  </head>
  <body>
    <script src="js/browser-connector.js"></script>
    <script src="js/vegaspec.js"></script>
    <script src="js/vegademo.js"></script>

    <script>
    document.addEventListener('DOMContentLoaded', init, false);
    </script>
  </body>
</html>

JavaScript

Advanced Chart Type Tutorial vegademo.js

function init() {
  var vegaOptions = {}
  var connector = new MapdCon()
    .protocol("http")
    .host("my.host.com")
    .port("6273")
    .dbName("omnisci")
    .user("omnisci")
    .password("changeme")
    .connect(function(error, con) {
      con.renderVega(1, JSON.stringify(exampleVega), vegaOptions, function(error, result) {
        if (error) {
          console.log(error.message);
        }
        else {
          var blobUrl = `data:image/png;base64,${result.image}`
          var body = document.querySelector('body')
          var vegaImg = new Image()
          vegaImg.src = blobUrl
          body.append(vegaImg)
        }
      });
    });
}

Advanced Chart Type Tutorial vegaspec.js

const exampleVega = {
  "width": 733,
  "height": 530,
  "data": [
    {
      "name": "heatmap_query",
      // A template literal lets the SQL span several lines; a double-quoted string cannot.
      "sql": `SELECT rect_pixel_bin(conv_4326_900913_x(lon), -13847031.457875465, -7451726.712679257, 733, 733) as x,
                     rect_pixel_bin(conv_4326_900913_y(lat), 2346114.147993467, 6970277.197053557, 530, 530) as y,
                     SUM(amount) as cnt
                     FROM contributions
                     WHERE (lon >= -124.39000000000038 AND lon <= -66.93999999999943) AND
                           (lat >= 20.61570573311549 AND lat <= 52.93117449504004) AND
                           amount > 0 AND
                           recipient_party = 'R'
                           GROUP BY x, y`
    }
  ],
  "scales": [
    {
      "name": "heat_color",
      "type": "quantize",
      "domain": [
        10000.0,
        1000000.0
      ],
      "range": [ "#0d0887", "#2a0593", "#41049d", "#5601a4", "#6a00a8",
                 "#7e03a8", "#8f0da4", "#a11b9b", "#b12a90", "#bf3984",
                 "#cb4679", "#d6556d", "#e16462", "#ea7457", "#f2844b",
                 "#f89540", "#fca636", "#feba2c", "#fcce25", "#f7e425", "#f0f921"
      ],
      "default": "#0d0887",
      "nullValue": "#0d0887"
    }
  ],
  "marks": [
    {
      "type": "symbol",
      "from": {
        "data": "heatmap_query"
      },
      "properties": {
        "shape": "square",
        "x": {
          "field": "x"
        },
        "y": {
          "field": "y"
        },
        "width": 1,
        "height": 1,
        "fillColor": {
          "scale": "heat_color",
          "field": "cnt"
        }
      }
    }
  ]
};

Using Polys Marks Type

Source code can be found at the end of this tutorial.

This tutorial introduces the polys marks type, which uses an implicit polygon data table format. The visualization is a map of zip codes color-coded according to average contribution amount. The data table encodes polygons representing zip code areas.

See the Poly Map with Backend Rendering charting example for a programmatic rendering of this visualization.

The following data property extracts the average contribution amount from the contributions_donotmodify data table, omitting rows that do not have a contribution amount:

"data": [
  {
    "name": "polys",
    "format": "polys",
    "sql": "SELECT zipcodes.rowid,AVG(contributions_donotmodify.amount) AS avgContrib FROM contributions_donotmodify,zipcodes WHERE contributions_donotmodify.amount IS NOT NULL AND contributions_donotmodify.contributor_zipcode = zipcodes.ZCTA5CE10 GROUP BY zipcodes.rowid ORDER BY avgContrib DESC"
  }
]

When working with polygon data, the "format": "polys" property must be specified.

The scales specification scales x values to the visualization area width and y values to the height. A color scale, polys_fillColor, is also specified; it linearly scales nine contribution amount ranges to nine colors:

"scales": [
  {
    "name": "x",
    "type": "linear",
    "domain": [-19646150.75527339, 19646150.755273417],
    "range": "width"
  },
  {
    "name": "y",
    "type": "linear",
    "domain": [-3071257.58106188, 10078357.267122284],
    "range": "height"
  },
  {
    "name": "polys_fillColor",
    "type": "linear",
    "domain": [0, 325, 650, 975, 1300, 1625, 1950, 2275, 2600],
    "range": ["#115f9a", "#1984c5", "#22a7f0", "#48b5c4", "#76c68f", "#a6d75b", "#c9e52f", "#d0ee11", "#d0f400"],
    "default": "green"
  }
]

Zip code areas for which average contribution amounts are not specified by the domain are color-coded green.
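As a rough illustration of how a multi-stop linear color scale behaves, the sketch below interpolates between neighboring color stops. The helper names hexToRgb and linearColorScale are hypothetical, only the first three stops of polys_fillColor are used, and the fallback to the default color for out-of-domain values is an assumption based on the description above.

```javascript
// Parse "#rrggbb" into [r, g, b].
function hexToRgb(hex) {
  return [1, 3, 5].map((i) => parseInt(hex.slice(i, i + 2), 16));
}

// Sketch of a multi-stop linear color scale: find the pair of domain stops
// that bracket the value and interpolate between the matching colors.
// Assumption: values outside the domain fall back to `defaultColor`.
function linearColorScale(domain, range, defaultColor) {
  const colors = range.map(hexToRgb);
  return function (value) {
    if (value < domain[0] || value > domain[domain.length - 1]) return defaultColor;
    let i = 1;
    while (i < domain.length - 1 && value > domain[i]) i++;
    const t = (value - domain[i - 1]) / (domain[i] - domain[i - 1]);
    const rgb = colors[i - 1].map((c, k) => Math.round(c + t * (colors[i][k] - c)));
    return `rgb(${rgb.join(",")})`;
  };
}

const polysFill = linearColorScale(
  [0, 325, 650],
  ["#115f9a", "#1984c5", "#22a7f0"],
  "green"
);
console.log(polysFill(0));     // the first stop, #115f9a
console.log(polysFill(162.5)); // halfway between the first two stops
console.log(polysFill(3000));  // outside the domain, so the default color
```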

The marks property specifies that the data from the polys data table is visually encoded as polygons:

"marks": [
  {
    "type": "polys",
    "from": { "data": "polys" },

       ... elided ...

    }
  }
]

Polygon x and y vertex locations are transformed to the visualization area using the x and y scales.

"marks": [
  {
       ... elided ...

    "properties": {
      "x": {
        "scale": "x",
        "field": "x"
      },
      "y": {
        "scale": "y",
        "field": "y"
      },
       ... elided ...
    }
  }
]

The x and y polygon vertex locations are implicitly encoded in the data table as described in Polys Type.

Polygon fill color color-codes the average contribution amount, avgContrib, linearly scaled by the polys_fillColor scale:

"marks": [
  {
       ... elided ...

    "properties": {

       ... elided ...

      "fillColor": {
        "scale": "polys_fillColor",
        "field": "avgContrib"
      },

       ... elided ...

    }
  }
]

Finally, the marks property specifies the polygon border width and color, and line join constraints:

"marks": [
  {
       ... elided ...

    "properties": {

       ... elided ...

      "strokeColor": "white",
      "strokeWidth": 0,
      "lineJoin": "miter",
      "miterLimit": 10
    }
  }
]

Source Code

Working with Polys Tutorial Directory Structure

index.html
/js
  browser-connector.js
  vegaspec.js
  vegademo.js

HTML

Working with Polys Tutorial index.html

<!DOCTYPE html>
<html lang="en">
  <head>
    <title>OmniSci</title>
    <meta charset="UTF-8">
  </head>
  <body>
    <script src="js/browser-connector.js"></script>
    <script src="js/vegaspec.js"></script>
    <script src="js/vegademo.js"></script>

    <script>
    document.addEventListener('DOMContentLoaded', init, false);
    </script>
  </body>
</html>

JavaScript

Working with Polys Tutorial vegademo.js

function init() {
  var vegaOptions = {}
  var connector = new MapdCon()
    .protocol("http")
    .host("my.host.com")
    .port("6273")
    .dbName("omnisci")
    .user("omnisci")
    .password("changeme")
    .connect(function(error, con) {
      con.renderVega(1, JSON.stringify(exampleVega), vegaOptions, function(error, result) {
        if (error) {
          console.log(error.message);
        }
        else {
          var blobUrl = `data:image/png;base64,${result.image}`
          var body = document.querySelector('body')
          var vegaImg = new Image()
          vegaImg.src = blobUrl
          body.append(vegaImg)
        }
      });
    });
}

Working with Polys Tutorial vegaspec.js

const exampleVega = {
  "width": 1004,
  "height": 336,
  "data": [
    {
      "name": "polys",
      "format": "polys",
      "sql": "SELECT zipcodes.rowid,AVG(contributions_donotmodify.amount) AS avgContrib FROM contributions_donotmodify,zipcodes WHERE contributions_donotmodify.amount IS NOT NULL AND contributions_donotmodify.contributor_zipcode = zipcodes.ZCTA5CE10 GROUP BY zipcodes.rowid ORDER BY avgContrib DESC"
    }
  ],
  "scales": [
    {
      "name": "x",
      "type": "linear",
      "domain": [
        -19646150.75527339,
        19646150.755273417
      ],
      "range": "width"
    },
    {
      "name": "y",
      "type": "linear",
      "domain": [
        -3071257.58106188,
        10078357.267122284
      ],
      "range": "height"
    },
    {
      "name": "polys_fillColor",
      "type": "linear",
      "domain": [
        0, 325, 650, 975,
        1300, 1625, 1950, 2275, 2600
      ],
      "range": [
        "#115f9a", "#1984c5", "#22a7f0", "#48b5c4",
        "#76c68f", "#a6d75b", "#c9e52f", "#d0ee11", "#d0f400"
      ],
      "default": "green",
      "nullValue": "#CACACA"
    }
  ],
  "marks": [
    {
      "type": "polys",
      "from": {
        "data": "polys"
      },
      "properties": {
        "x": {
          "scale": "x",
          "field": "x"
        },
        "y": {
          "scale": "y",
          "field": "y"
        },
        "fillColor": {
          "scale": "polys_fillColor",
          "field": "avgContrib"
        },
        "strokeColor": "white",
        "strokeWidth": 0,
        "lineJoin": "miter",
        "miterLimit": 10
      }
    }
  ]
};

Vega Accumulator

Accumulation works by aggregating data per pixel during a backend render. Data is accumulated for every pixel for every shape rendered. Accumulation rendering is activated through color scales – scales that have colors defined for their range.

Note: Currently, only the COUNT aggregation function is supported.

This topic describes accumulation rendering and provides some implementation examples. The data source used here – a table called contributions – contains information about political donations made in the New York City area, including party affiliation, the amount of the donation, and location of the donor.

There are three accumulation modes: density, blend, and percentage.

Density Mode

Density accumulation performs a count aggregation by pixel. It allows you to color a pixel by normalizing the count and applying a color to it, based on a color scale. In Heavy Immerse, if you open or create a Pointmap chart, you can toggle density accumulation on and off by using the Density Gradient attribute. For more information, see .

Note: Blend and percentage accumulation are not currently available in Heavy Immerse.

The density mode examples use the following base code:

{
  "width": 714,
  "height": 535,
  "data": [
        {
          "name": "table",
          "sql": "SELECT conv_4326_900913_x(lon) as x,  conv_4326_900913_y(lat) as y,amount,rowid FROM contributions WHERE (conv_4326_900913_x(lon) between -8274701.640628147 and -8192178.083370286) AND (conv_4326_900913_y(lat) between 4946220.843530051 and 5008055.72186748) LIMIT 2000000",
          "dbTableName": "contributions"
        }
  ],
  "scales": [
        {
          "name": "x",
          "type": "linear",
          "domain": [-8274701.640628147,-8192178.083370286],
          "range": "width"
        },
        {
          "name": "y",
          "type": "linear",
          "domain": [4946220.843530051,5008055.72186748],
          "range": "height"
        },
        {
          "name": "pointcolor",
          "type": "linear",
          "domain": [100,10000],
          "range": ["blue","red"],
          "clamp": true
        }
  ],
  "marks": [
        {
          "type": "points",
          "from": {"data": "table"},
          "properties": {
                "x": {"scale": "x","field": "x"},
                "y": {"scale": "y","field": "y"},
                "fillColor": {"scale": "pointcolor","field": "amount"},
                "size": {"value": 2}
          }
        }
  ]
}

This code generates the following image:

All points are rendered with a size of 2 and colored according to the contribution amount:

$100 or less is colored blue.
$10,000 or more is colored red.
Anything in between is colored somewhere between blue and red, depending on the contribution. Amounts closer to $100 are more blue, and amounts closer to $10,000 are more red.

The examples that follow adjust the pointcolor scale and show the effects of various adjustments. Any changes made to Vega code are isolated to that scale definition.

Density accumulation can be activated for any scale that takes as input a continuous domain (linear, sqrt, pow, log, and threshold scales) and outputs a color range. In the following code snippet, the density accumulator has been added to the linear pointcolor scale:

{
  "name": "pointcolor",
  "type": "linear",
  "domain": [0.0,1.0],
  "range": ["blue","red"],
  "clamp": true,
  "accumulator": "density",
  "minDensityCnt": 1,
  "maxDensityCnt": 100
}

The final color at a pixel is determined by normalizing the per-pixel aggregated count and passing the normalized value to the scale function to calculate a color. The domain of a density accumulation scale should therefore contain values between 0 and 1 inclusive. The normalization is performed according to the minDensityCnt and maxDensityCnt properties: after normalization, minDensityCnt corresponds to 0 and maxDensityCnt corresponds to 1 in the domain. In this case, 0 in the domain equates to a per-pixel count of 1, and 1 in the domain equates to a per-pixel count of 100.
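The normalization step can be sketched as a small JavaScript function. The name normalizeDensity is hypothetical, and clamping counts that fall outside the min/max pair is an assumption suggested by the "clamp": true setting in the scale, not confirmed engine behavior.

```javascript
// Normalize a per-pixel count into [0, 1] using minDensityCnt and maxDensityCnt.
// Assumption: counts outside [minCnt, maxCnt] are clamped to the ends of the domain.
function normalizeDensity(count, minCnt, maxCnt) {
  const t = (count - minCnt) / (maxCnt - minCnt);
  return Math.min(1, Math.max(0, t));
}

console.log(normalizeDensity(1, 1, 100));   // 0, the low end of the domain
console.log(normalizeDensity(100, 1, 100)); // 1, the high end of the domain
console.log(normalizeDensity(250, 1, 100)); // 1 after clamping
```

The normalized value is what the scale then maps to a color between blue and red.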

minDensityCnt and maxDensityCnt are required properties. They can have explicit integer values, or they can use keywords that automatically compute statistical information about the per-pixel counts. Currently available keywords are:

min
max
1stStdDev
2ndStdDev
-1stStdDev
-2ndStdDev
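To make the keywords concrete, the following sketch resolves a minDensityCnt or maxDensityCnt setting against an array of per-pixel counts. It rests on several assumptions that the text does not confirm: that 1stStdDev and 2ndStdDev mean one and two standard deviations above the mean, that the negative forms mean the same distance below it, and that the population standard deviation is used. The function name resolveDensityCnt is hypothetical.

```javascript
// Resolve a minDensityCnt / maxDensityCnt setting against per-pixel counts.
// Assumptions: "1stStdDev" = mean + 1 standard deviation, "-2ndStdDev" =
// mean - 2 standard deviations, computed with the population standard deviation.
function resolveDensityCnt(setting, counts) {
  if (typeof setting === "number") return setting; // explicit values pass through
  const mean = counts.reduce((s, c) => s + c, 0) / counts.length;
  const variance = counts.reduce((s, c) => s + (c - mean) ** 2, 0) / counts.length;
  const std = Math.sqrt(variance);
  const keywords = {
    "min": Math.min(...counts),
    "max": Math.max(...counts),
    "1stStdDev": mean + std,
    "2ndStdDev": mean + 2 * std,
    "-1stStdDev": mean - std,
    "-2ndStdDev": mean - 2 * std,
  };
  if (!Object.prototype.hasOwnProperty.call(keywords, setting)) {
    throw new Error(`Unknown keyword: ${setting}`);
  }
  return keywords[setting];
}

const pixelCounts = [2, 4, 4, 4, 5, 5, 7, 9]; // mean 5, standard deviation 2
console.log(resolveDensityCnt("min", pixelCounts));
console.log(resolveDensityCnt("2ndStdDev", pixelCounts));
```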

If you change the color scale to the following:

{
  "name": "pointcolor",
  "type": "linear",
  "domain": [0.01,1.0],
  "range": ["blue","red"],
  "clamp": true,
  "accumulator": "density",
  "minDensityCnt": "min",
  "maxDensityCnt": "max"
}

The minimum aggregated count of all the pixels is used as the minDensityCnt, and the maximum aggregated count used as the maxDensityCnt. This results in the following:

Notice that the area with the most overlapping points is on the Upper East Side of Manhattan.

Now, use +/- 2 standard deviations for your counts:

{
  "name": "pointcolor",
  "type": "linear",
  "domain": [0.01,1.0],
  "range": ["blue","red"],
  "clamp": true,
  "accumulator": "density",
  "minDensityCnt": "-2ndStdDev",
  "maxDensityCnt": "2ndStdDev"
}

This produces the following:

In this example, the scale is changed to a threshold scale, and the colors are adjusted to create a more interesting image:

{
  "name": "pointcolor",
  "type": "threshold",
  "domain": [0.111, 0.222, 0.333, 0.444, 0.555, 0.666, 0.777, 0.888],
  "range": ["rgba(17,95,154,1)", "rgba(25,132,197,1)", "rgba(34,167,240,1)", "rgba(72,181,196,1)", "rgba(118,198,143,1)", "rgba(166,215,91,1)", "rgba(201,229,47,1)", "rgba(208,238,17,1)", "rgba(208,244,0,1)"],
  "clamp": true,
  "accumulator": "density",
  "minDensityCnt": "-2ndStdDev",
  "maxDensityCnt": "2ndStdDev"
}

This results in:

Note: You can mix and match explicit values and keywords for minDensityCnt and maxDensityCnt. However, if your min value is greater than your max value, your image might look inverted.

Blend Mode

Blend accumulation works only with ordinal scales. This accumulation type blends the per-category colors set by an ordinal scale so that you can visualize which categories are more or less prevalent in a particular area.

The following Vega code colors the points according to the value in the recipient_party column:

{
  "width": $width,
  "height": $height,
  "data": [
        {
          "name": "table",
          "sql": "SELECT conv_4326_900913_x(lon) as x,  conv_4326_900913_y(lat) as y,recipient_party,rowid FROM contributions WHERE (conv_4326_900913_x(lon) between $minXBounds and $maxXBounds) AND (conv_4326_900913_y(lat) between $minYBounds and $maxYBounds) LIMIT 2000000"
        }
  ],
  "scales": [
        {
          "name": "x",
          "type": "linear",
          "domain": [
                $minXBounds,
                $maxXBounds
          ],
          "range": "width"
        },
        {
          "name": "y",
          "type": "linear",
          "domain": [
                $minYBounds,
                $maxYBounds
          ],
          "range": "height"
        },
        {
          "name": "pointcolor",
          "type": "ordinal",
          "domain": ["R", "D"],
          "range": ["red", "blue"],
          "default": "green"
        }
  ],
  "marks": [
        {
          "type": "points",
          "from": {
                "data": "table"
          },
          "properties": {
                "x": {
                  "scale": "x",
                  "field": "x"
                },
                "y": {
                  "scale": "y",
                  "field": "y"
                },
                "fillColor": {
                  "scale": "pointcolor",
                  "field": "recipient_party"
                },
                "size": {
                  "value": 5
                }
          }
        }
  ]
}

This results in the following chart:

Each point is colored according to recipient party. Values of R (Republican) are colored red, D (Democrat) are colored blue, and everything else is colored green.

To activate blend accumulation, add the "accumulator": "blend" property to an ordinal scale.

{
  "name": "pointcolor",
  "type": "ordinal",
  "domain": ["R", "D"],
  "range": ["red", "blue"],
  "default": "green",
  "accumulator": "blend"
}

This generates the following chart:

Activating blend accumulation shows you where one party is more dominant in a particular area. The COUNT aggregation is now being applied for each category, and the colors associated with each category are blended according to the final percentage of each category per pixel.
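The per-pixel blending can be pictured with a short JavaScript sketch. It assumes a simple count-weighted average in RGB space, which may not match how the backend blends colors; the function name blendPixel and the sample counts are hypothetical.

```javascript
// Blend per-category colors by each category's share of the per-pixel count.
// `counts` maps category -> count at one pixel; `colors` maps category -> [r, g, b].
// Assumption: a simple count-weighted average in RGB space.
function blendPixel(counts, colors) {
  const total = Object.values(counts).reduce((s, c) => s + c, 0);
  const rgb = [0, 0, 0];
  for (const [category, count] of Object.entries(counts)) {
    colors[category].forEach((channel, k) => {
      rgb[k] += channel * (count / total);
    });
  }
  return rgb.map((channel) => Math.round(channel));
}

// Three "R" points and one "D" point overlap at this pixel.
const blended = blendPixel({ R: 3, D: 1 }, { R: [255, 0, 0], D: [0, 0, 255] });
console.log(blended); // mostly red with some blue
```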

Note: Unlike in density mode, a field property is required in mark properties that reference blend accumulator scales.

Percentage Mode

Percentage (pct) mode can help you visualize how prevalent a specific category is based on a percentage. Any scale can be used in percentage mode, but the domain values must be between 0 and 1, where 0 is 0% and 1 is 100%.

Using the political donations database, you can determine where the recipient_party of “R” (Republican) is more prevalent.

Here’s the color scale:

{
  "name": "pointcolor",
  "type": "threshold",
  "domain": [0.33, 0.66],
  "range": ["blue", "purple", "red"],
  "accumulator": "pct",
  "pctCategory": "R"
}

And the resulting image:

Using the threshold scale, anything colored blue is between 0%-33% Republican, purple is 33%-66% Republican, and red is 66%-100% Republican.

pctCategory is a required property for percentage mode and can be numeric or a string. A string refers to a string value from a dictionary-encoded column.
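The following JavaScript sketch shows the idea behind percentage mode with a threshold scale: compute the share of points at a pixel that match pctCategory, then look up the color bucket. The function names are hypothetical, and sending a value that sits exactly on a cut point to the upper bucket is an assumption.

```javascript
// Threshold scale: domain values are cut points; range has one more entry than domain.
// Assumption: a value equal to a cut point falls into the upper bucket.
function thresholdScale(domain, range) {
  return (value) => {
    let i = 0;
    while (i < domain.length && value >= domain[i]) i++;
    return range[i];
  };
}

// Fraction of points at a pixel whose category equals pctCategory.
function pctAtPixel(categories, pctCategory) {
  const matches = categories.filter((c) => c === pctCategory).length;
  return matches / categories.length;
}

const pctColor = thresholdScale([0.33, 0.66], ["blue", "purple", "red"]);
const share = pctAtPixel(["R", "D", "D", "R", "R", "R"], "R"); // 4 of 6 points
console.log(pctColor(share)); // "red"
```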

You can modify the example to use a numeric value for pctCategory. First, modify the SQL in the Vega to select the contribution amount for each data point:

"SELECT conv_4326_900913_x(lon) as x, conv_4326_900913_y(lat) as y,amount,rowid FROM contributions WHERE (conv_4326_900913_x(lon) between -8274701.640628147 and -8192178.083370286) AND (conv_4326_900913_y(lat) between 4946220.843530051 and 5008055.72186748) LIMIT 2000000"

Now use the amount as the driving field for the pct accumulator scale:

"fillColor": {"scale": "pointcolor","field": "amount"},

Now, change the pct scale to the following:

{
  "name": "pointcolor",
  "type": "threshold",
  "domain": [0.33, 0.66],
  "range": ["blue", "purple", "red"],
  "accumulator": "pct",
  "pctCategory": 1000
}

This results in the following output, showing where thousand-dollar contributions are most prevalent:

You can use the pctCategoryMargin property to buffer numeric pctCategory values, so you can use a range for the numeric category.
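One way to picture the margin is as a symmetric band around the numeric category. The sketch below assumes that a value matches when it lies within pctCategory plus or minus pctCategoryMargin, inclusive; the exact matching rule is not stated here, so treat this as an illustration. The function name pctWithMargin is hypothetical.

```javascript
// Assumption: a point matches when its value lies within
// pctCategory +/- pctCategoryMargin (inclusive on both ends).
function pctWithMargin(values, pctCategory, pctCategoryMargin) {
  const matches = values.filter(
    (v) => Math.abs(v - pctCategory) <= pctCategoryMargin
  ).length;
  return matches / values.length;
}

// With a margin of 50, contributions of 950 and 1000 count toward the category.
console.log(pctWithMargin([950, 1000, 1100, 250], 1000, 50)); // 0.5
```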

Using Transform Aggregation

You can create Vega-based visualizations with render properties that are driven by aggregated statistics. You can use Vega transform aggregation and formula expressions to automate the process of gathering statistical information about a rendered query. By doing so, you do not have to run an SQL prequery to get the information, thereby reducing the time it takes to process and render a chart.

The following examples show how to use transforms in Vega to do the following:

Render a heatmap that is colored using dynamic statistics of the bins
Create a geo pointmap with different transform modes

NOTE: You can see Vega examples in the . For more information about the OmniSci Vega engine, see .

Rendering a Heatmap Colored by Bin Statistics

The following heatmap example demonstrates the benefits of Vega transforms for performance and reducing redundancy:

First, the example shows using an SQL expression to render a heatmap, as well as an additional expression to color the hexagonal bins according to the min and max of the cnt value of the aggregated bins from the query.
Then, you will see how to render the heatmap and color the bins directly in Vega by using source data definitions and performing aggregation transforms on that data, decreasing chart rendering time and redundancy.

SQL Queries

The following is a typical SQL query used for rendering a hexagonal heatmap:

SELECT
        reg_hex_horiz_pixel_bin_x(conv_4326_900913_x(lon),conv_4326_900913_x(-157),conv_4326_900913_x(157),conv_4326_900913_y(lat),conv_4326_900913_y(-63),conv_4326_900913_y(81),9.9667,11.5085,0,0,897,647) as x,
        reg_hex_horiz_pixel_bin_y(conv_4326_900913_x(lon),conv_4326_900913_x(-157),conv_4326_900913_x(157),conv_4326_900913_y(lat),conv_4326_900913_y(-63),conv_4326_900913_y(81),9.9667,11.5085,0,0,897,647) as y,
        count(*) as cnt
FROM tweets_nov_feb
WHERE ((lon >= -157 AND lon <= 157) AND (lat >= -63 AND lat <= 81))
GROUP BY x, y;

To color the hexagonal bins according to the min and max of the cnt value of the bins from the query, you need to run a prequery to gather these statistics manually. Here, this is done using a subquery SQL statement:

    SELECT
             min(cnt), max(cnt)
    FROM (SELECT
       reg_hex_horiz_pixel_bin_x(conv_4326_900913_x(lon),conv_4326_900913_x(-157),conv_4326_900913_x(157),conv_4326_900913_y(lat),conv_4326_900913_y(-63),conv_4326_900913_y(81),9.9667,11.5085,0,0,897,647) as x,
       reg_hex_horiz_pixel_bin_y(conv_4326_900913_x(lon),conv_4326_900913_x(-157),conv_4326_900913_x(157),conv_4326_900913_y(lat),conv_4326_900913_y(-63),conv_4326_900913_y(81),9.9667,11.5085,0,0,897,647) as y,
       count(*) as cnt
FROM tweets_nov_feb
WHERE ((lon >= -157 AND lon <= 157) AND (lat >= -63 AND lat <= 81))
GROUP BY x, y
    );

The values returned from this query can then be embedded in the Vega code to color the heatmap bins. Notice that the second query does an aggregation over the query, effectively running the query twice.

To avoid the redundancy and expense of running the query twice, you can instead specify the aggregation in Vega.

Render the Heatmap in Vega

The following Vega code renders the heatmap colored by aggregated statistics using transforms.

{
  "width": 897,
  "height": 647,
  "data": [
        {
          "name": "heatmap_query",
          "sql": "SELECT reg_hex_horiz_pixel_bin_x(conv_4326_900913_x(lon),conv_4326_900913_x(-157),conv_4326_900913_x(157),conv_4326_900913_y(lat),conv_4326_900913_y(-63),conv_4326_900913_y(81),9.9667,11.5085,0,0,897,647) as x, reg_hex_horiz_pixel_bin_y(conv_4326_900913_x(lon),conv_4326_900913_x(-157),conv_4326_900913_x(157),conv_4326_900913_y(lat),conv_4326_900913_y(-63),conv_4326_900913_y(81),9.9667,11.5085,0,0,897,647) as y, count(*) as cnt FROM tweets_nov_feb WHERE ((lon >= -157 AND lon <= 157) AND (lat >= -63 AND lat <= 81)) GROUP BY x, y"
        },
        {
          "name": "heatmap_stats",
          "source": "heatmap_query",
          "transform": [
                {
                  "type": "aggregate",
                  "fields": ["cnt", "cnt"],
                  "ops":    ["min", "max"],
                  "as":     ["mincnt", "maxcnt"]
                }
          ]
        }
  ],
  "scales": [
        {
          "name": "heat_color",
          "type": "quantize",
          "domain": {"data": "heatmap_stats", "fields": ["mincnt", "maxcnt"]},
          "range": ["#115f9a", "#1984c5", "#22a7f0", "#48b5c4", "#76c68f",
                                "#a6d75b", "#c9e52f", "#d0ee11", "#d0f400"
          ]
        }
  ],
  "marks": [
        {
          "type": "symbol",
          "from": {
                "data": "heatmap_query"
          },
          "properties": {
                "shape": "hexagon-horiz",
                "xc": {
                  "field": "x"
                },
                "yc": {
                  "field": "y"
                },
                "width": 9.9667,
                "height": 11.5085,
                "fillColor": {
                  "scale": "heat_color",
                  "field": "cnt"
                }
          }
        }
  ]
}

The data section named heatmap_stats has a source data table defined by the "source": "heatmap_query" line:

"name": "heatmap_stats",
"source": "heatmap_query",

The "heatmap_stats" data takes as input the "heatmap_query" data, which is the data supplied by the SQL query. Use the source data type to apply intermediary steps or expressions (transforms) to the input source data.
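Conceptually, the aggregate transform reduces the source rows to a single row of statistics. The JavaScript sketch below mimics that with the same fields, ops, and as arrays; the function aggregate, the OPS table, and the sample rows are hypothetical and only illustrate the data flow, not the backend implementation.

```javascript
// Supported operations for this sketch; each reduces a column to one number.
const OPS = {
  min: (xs) => Math.min(...xs),
  max: (xs) => Math.max(...xs),
  avg: (xs) => xs.reduce((s, x) => s + x, 0) / xs.length,
};

// Reduce rows to one row of statistics, mirroring the "fields", "ops", "as" arrays.
function aggregate(rows, { fields, ops, as }) {
  const out = {};
  fields.forEach((field, i) => {
    const column = rows.map((row) => row[field]);
    out[as[i]] = OPS[ops[i]](column);
  });
  return out;
}

// Hypothetical stand-in for the rows returned by the heatmap_query SQL.
const heatmapQuery = [
  { x: 0, y: 0, cnt: 4 },
  { x: 1, y: 0, cnt: 19 },
  { x: 2, y: 1, cnt: 7 },
];
const heatmapStats = aggregate(heatmapQuery, {
  fields: ["cnt", "cnt"],
  ops: ["min", "max"],
  as: ["mincnt", "maxcnt"],
});
console.log(heatmapStats); // { mincnt: 4, maxcnt: 19 }
```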

To color the data according to the range of values within two standard deviations of the mean, edit the "heatmap_stats" section to do the following:

Aggregate the minimum, maximum, average, and sampled standard deviation of the count column.
Use formula expressions to calculate values that are two standard deviations from the average.

{
   "name": "heatmap_stats",
   "source": "heatmap_query",
   "transform": [
     {
       "type": "aggregate",
       "fields": ["cnt", "cnt", "cnt", "cnt"],
       "ops":    ["min", "max", "avg", "stddev"],
       "as":     ["mincnt", "maxcnt", "avgcnt", "stdcnt"]
     },
     {
       "type": "formula",
       "expr": "max(mincnt, avgcnt-2*stdcnt)",
       "as": "mincnttouse"
     },
     {
       "type": "formula",
       "expr": "min(maxcnt, avgcnt+2*stdcnt)",
       "as": "maxcnttouse"
     }
   ]
 }

Then, reference these values in the scale domain:

{
   "name": "heat_color",
   "type": "quantize",
   "domain": {"data": "heatmap_stats", "fields": ["mincnttouse", "maxcnttouse"]},
   "range": ["#115f9a", "#1984c5", "#22a7f0", "#48b5c4", "#76c68f",
             "#a6d75b", "#c9e52f", "#d0ee11", "#d0f400"
   ]
 }

Performing these calculations in Vega improves performance because the SQL query is only run once and the aggregated statistics are done “on the fly.” Because the query is not repeated in a statistical prequery step, you can reduce the full render time by half by performing the statistics step in Vega at render time.
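The effect of the two formula expressions can be checked with a small JavaScript sketch. It assumes the stddev operation is the sample standard deviation, as the description of the aggregation suggests; the function name clampedStdDevRange and the sample counts are hypothetical.

```javascript
// Compute the scale domain used above: two standard deviations around the
// average, clamped to the observed min and max.
function clampedStdDevRange(values) {
  const n = values.length;
  const avg = values.reduce((s, v) => s + v, 0) / n;
  // Assumption: sample standard deviation (n - 1 in the denominator).
  const std = Math.sqrt(values.reduce((s, v) => s + (v - avg) ** 2, 0) / (n - 1));
  const min = Math.min(...values);
  const max = Math.max(...values);
  return {
    mincnttouse: Math.max(min, avg - 2 * std), // "max(mincnt, avgcnt-2*stdcnt)"
    maxcnttouse: Math.min(max, avg + 2 * std), // "min(maxcnt, avgcnt+2*stdcnt)"
  };
}

// Nine sparse bins and one very dense bin.
const statsDomain = clampedStdDevRange([1, 1, 1, 1, 1, 1, 1, 1, 1, 100]);
console.log(statsDomain);
```

In this sample, the single dense bin no longer stretches the color domain to 100, so the sparse bins are spread over more of the color range.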

Creating a Geo Pointmap Using Transforms to Drive Color and Size

This section shows how to use Vega transforms to drive the color and size of points in a geo pointmap. Specifically, it shows examples using the following aggregation transforms:

distinct: An array of distinct values from an input data column.

median: The median of an input data column.

quantile: An array of quantile separators; operates on numeric columns and takes the following parameters:

numQuantiles: The number of contiguous intervals to create; returns the separators for the intervals. The number of separators equals numQuantiles - 1.
includeExtrema: Whether to include min and max values (extrema) in the resulting separator array. The size of the resulting separator array will be numQuantiles + 1.

As with the heatmap example described earlier, using Vega transforms eliminates the need for an SQL prequery and significantly improves performance for dynamic operations.

The examples that follow use a Twitter dataset to create a geo pointmap.

Distinct

In the following example, the size of the points in a geo pointmap is defined by the numeric range two standard deviations from the average number of followers of the input data. The color of the points is driven by the distinct languages of the input data. To calculate the distinct languages, you could run a prequery using DISTINCT and then populate a Vega color scale with the results. However, the query would need to be run before every render update if the distinct data is meant to be dynamic, which would be very costly.

With the distinct Vega transform, this can be performed when evaluating the Vega code in the backend, so you do not need to run the prequery. This can improve performance considerably.

{
  "width": $width,
  "height": $height,
  "data": [
    {
      "name": "table",
      "sql": "SELECT conv_4326_900913_x(lon) as x,conv_4326_900913_y(lat) as y,followers,lang,rowid FROM tweets_2017_may WHERE (lon between $minLon and $maxLon AND lat between $minLat and $maxLat) LIMIT 200000"
    },
    {
      "name": "xformtable",
      "source": "table",
      "transform": [
        {
          "type": "aggregate",
          "fields": ["followers", "followers", "followers", "followers", "lang"],
          "ops":    ["min", "max", "avg", "stddev", "distinct"],
          "as":     ["minfol", "maxfol", "avgfol", "stdfol", "distinctlang"]
        },
        {
          "type": "formula",
          "expr": "max(minfol, avgfol-2*stdfol)",
          "as": "minfoltouse"
        },
        {
          "type": "formula",
          "expr": "min(maxfol, avgfol+2*stdfol)",
          "as": "maxfoltouse"
        }
      ]
    }
  ],
  "scales": [
    {
      "name": "x",
      "type": "linear",
      "domain": [
        $minXBounds,
        $maxXBounds
      ],
      "range": "width"
    },
    {
      "name": "y",
      "type": "linear",
      "domain": [
        $minYBounds,
        $maxYBounds
      ],
      "range": "height"
    },
    {
      "name": "size",
      "type": "linear",
      "domain": {"data": "xformtable", "fields": ["minfoltouse", "maxfoltouse"]},
      "range": [
        1, 20
      ],
      "clamp": true
    },
    {
      "name": "color",
      "type": "ordinal",
      "domain": {"data": "xformtable", "field": "distinctlang"},
      "range": [
        "blue", "red", "green", "yellow", "magenta", "purple", "teal"
      ]
    }
  ],
  "marks": [
    {
      "type": "points",
      "from": {
        "data": "table"
      },
      "properties": {
        "x": {
          "scale": "x",
          "field": "x"
        },
        "y": {
          "scale": "y",
          "field": "y"
        },
        "fillColor": {
          "scale": "color",
          "field": "lang"
        },
        "size": {
          "scale": "size",
          "field": "followers"
        }
      }
    }
  ]
}

This Vega code results in this image:
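As a rough sketch of how distinct values can feed an ordinal scale, the JavaScript below collects the distinct languages and maps each one to a color by position. The function names, the sample rows, the first-seen ordering, and the fallback color for values without a matching range entry are all assumptions made for illustration.

```javascript
// Collect distinct values in first-seen order, as a "distinct" aggregation might.
// Assumption: the order in which the backend returns distinct values is not specified.
function distinct(rows, field) {
  return [...new Set(rows.map((row) => row[field]))];
}

// Ordinal scale: the i-th domain value maps to the i-th range value.
// Assumption: values without a matching range entry get `fallback`.
function ordinalScale(domain, range, fallback) {
  const lookup = new Map(domain.map((value, i) => [value, range[i]]));
  return (value) => lookup.get(value) ?? fallback;
}

// Hypothetical rows standing in for the tweet query results.
const tweets = [{ lang: "en" }, { lang: "es" }, { lang: "en" }, { lang: "fr" }];
const distinctlang = distinct(tweets, "lang"); // ["en", "es", "fr"]
const langColor = ordinalScale(distinctlang, ["blue", "red"], "gray");
console.log(langColor("es")); // "red"
console.log(langColor("fr")); // "gray", because the range has only two colors
```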

Median

Outliers in a dataset can significantly skew statistics such as AVG and STDDEV. To mitigate this, you can use median and quantile to create a more meaningful probability distribution of the data. Median and quantiles are computed dynamically when Vega is evaluated and can be used to drive different render properties.

The following hexmap example uses median to drive the color of the hex bins. Notice in the final render that roughly half of the bins are colored red, and the other half are blue.

{
  "width": $width,
  "height": $height,
  "data": [
    {
      "name": "heatmap_query",
      "sql": "SELECT reg_hex_horiz_pixel_bin_x(conv_4326_900913_x(lon),$minXBounds,$maxXBounds,conv_4326_900913_y(lat),$minYBounds,
	  $maxYBounds,9.931506849315069,11.467916305821335,0,0,$width,$height) as x,reg_hex_horiz_pixel_bin_y(conv_4326_900913_x(lon),$minXBounds,$maxXBounds,
	  conv_4326_900913_y(lat),$minYBounds,$maxYBounds,9.931506849315069,11.467916305821335,0,0,$width,$height) as y, count(*) as color FROM tweets_2017_may 
	  WHERE ((lon >= $minLon AND lon <= $maxLon) AND (lat >= $minLat AND lat <= $maxLat)) GROUP BY x, y"
    },
    {
      "name": "heatmap_statistics",
      "source": "heatmap_query",
      "transform": [
        {
          "type": "aggregate",
          "fields": ["color"],
          "ops":    ["median"],
          "as":     ["mediancolor"]
        }
      ]
    }
  ],
  "scales": [
    {
      "name": "heat_color",
      "type": "threshold",
      "domain": {"data": "heatmap_statistics", "field": "mediancolor"},
      "range": [
        "blue", "red"
      ]
    }
  ],
  "marks": [
    {
      "type": "symbol",
      "from": {
        "data": "heatmap_query"
      },
      "properties": {
        "shape": "hexagon-horiz",
        "xc": {
          "field": "x"
        },
        "yc": {
          "field": "y"
        },
        "width": 9.931506849315069,
        "height": 11.467916305821335,
        "fillColor": {
          "scale": "heat_color",
          "field": "color"
        }
      }
    }
  ]
}
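The half-and-half split follows directly from using the median as the single threshold. A minimal JavaScript sketch, with hypothetical bin counts and the assumption that a bin equal to the median falls into the upper bucket:

```javascript
// Median of a numeric array: middle value, or the mean of the two middle values.
function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

const binCounts = [3, 40, 7, 12, 1, 25]; // hypothetical per-bin counts
const cut = median(binCounts);           // 9.5
// Bins below the median are blue; bins at or above it are red.
const binColors = binCounts.map((c) => (c >= cut ? "red" : "blue"));
console.log(cut, binColors);
```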

Quantile

The quantile function takes two additional parameters:

numQuantiles is the number of contiguous intervals to create and returns the separators for the intervals. The number of returned separators is numQuantiles - 1.
includeExtrema is a true or false value indicating whether to include the extrema (min and max) in the resulting separator array. If true, the number of returned values is numQuantiles + 1.

To see how a quantile works, consider a query that results in this set of values for "followers":

{3, 6, 7, 8, 8, 10, 13, 15, 16, 20}

With a quantile operator defined as {"type": "quantile", "numQuantiles": 4}, the result of the operator would be the following array:

[7, 9, 15]

Roughly 25% of the data has fewer than 7 followers, 25% has between 7 and 9, 25% has between 9 and 15, and 25% has more than 15.

With a quantile operator defined as {"type": "quantile", "numQuantiles": 4, "includeExtrema": true}, the result of the operator would be the following array:

[3, 7, 9, 15, 20]

With "includeExtrema" == true, the min and max are included in the resulting array, so 25% of the data has between 3 and 7 followers, 25% has between 7 and 9, 25% has between 9 and 15, and 25% has between 15 and 20.
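
The arithmetic behind these arrays can be sketched in JavaScript. The function name quantileSeparators and the rank rule used here (average the two neighboring values when the rank is a whole number, otherwise round the rank up) are assumptions chosen because they reproduce the arrays above; this guide does not specify the exact interpolation method the backend uses.

```javascript
// Hypothetical helper: returns the separators between numQuantiles
// contiguous intervals of the given values.
function quantileSeparators(values, numQuantiles, includeExtrema = false) {
  const sorted = [...values].sort((a, b) => a - b);
  const n = sorted.length;
  const separators = [];
  for (let k = 1; k < numQuantiles; k++) {
    const rank = (n * k) / numQuantiles; // 1-based, possibly fractional
    if (Number.isInteger(rank)) {
      // Whole-number rank: average that value and the next one.
      separators.push((sorted[rank - 1] + sorted[rank]) / 2);
    } else {
      // Fractional rank: round up to the next value.
      separators.push(sorted[Math.ceil(rank) - 1]);
    }
  }
  return includeExtrema
    ? [sorted[0], ...separators, sorted[n - 1]]
    : separators;
}

const followers = [3, 6, 7, 8, 8, 10, 13, 15, 16, 20];
console.log(quantileSeparators(followers, 4));       // [7, 9, 15]
console.log(quantileSeparators(followers, 4, true)); // [3, 7, 9, 15, 20]
```

Note that numQuantiles: 4 yields three separators, and adding the extrema yields five values, matching the numQuantiles - 1 and numQuantiles + 1 rules above.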

The following Vega code snippet gets the octiles (8-quantiles) and sextiles (6-quantiles) of a column called "followers":

...
{
 "name": "xformtable",
 "source": "table",
 "transform": [
   {
     "type": "aggregate",
     "fields": ["followers", "followers"],
     "ops":    [{"type": "quantile", "numQuantiles": 8}, {"type": "quantile", "numQuantiles": 6}],
     "as":     ["octile_fol", "sextile_fol"]
   }
 ]
}
...

Here is a more complete example using sextiles. Notice in the resulting image approximately the same number of hexagons appears in each of the six quantile groups colored blue to red, from left to right.

{
  "width": $width,
  "height": $height,
  "data": [
    {
      "name": "heatmap_query",
      "sql": "SELECT reg_hex_horiz_pixel_bin_x(conv_4326_900913_x(lon),$minXBounds,$maxXBounds,conv_4326_900913_y(lat),
	  $minYBounds,$maxYBounds,9.931506849315069,11.467916305821335,0,0,$width,$height) as x,reg_hex_horiz_pixel_bin_y(conv_4326_900913_x(lon),
      $minXBounds,$maxXBounds,conv_4326_900913_y(lat),$minYBounds,$maxYBounds,9.931506849315069, 11.467916305821335,0,0,$width,$height) as y, 
      count(*) as color FROM tweets_2017_may WHERE ((lon >= $minLon AND lon <= $maxLon) AND (lat >= $minLat AND lat <= $maxLat)) GROUP BY x, y"
    },
    {
      "name": "heatmap_statistics",
      "source": "heatmap_query",
      "transform": [
        {
          "type": "aggregate",
          "fields": ["x"],
          "ops":    [{"type": "quantile", "numQuantiles": 6}],
          "as":     ["sextilex"]
        }
      ]
    }
  ],
  "scales": [
     {
      "name": "heat_color",
      "type": "threshold",
      "domain": {"data": "heatmap_statistics", "field": "sextilex"},
      "range": [
        "rgb(0, 0, 255)", "rgb(51, 0, 204)", "rgb(102, 0, 153)", "rgb(153, 0, 102)", "rgb(204, 0, 51)", "rgb(255, 0, 0)"
      ]
    }
  ],
  "marks": [
    {
      "type": "symbol",
      "from": {
        "data": "heatmap_query"
      },
      "properties": {
        "shape": "hexagon-horiz",
        "xc": {
          "field": "x"
        },
        "yc": {
          "field": "y"
        },
        "width": 9.931506849315069,
        "height": 11.467916305821335,
        "fillColor": {
          "scale": "heat_color",
          "field": "x"
        }
      }
    }
  ]
}
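
To see why each sextile group receives its own color, it can help to sketch how a threshold scale picks a range value. The helper name thresholdScale and the separator values below are made up for illustration, and assigning a value that equals a separator to the upper interval is an assumption, not documented backend behavior.

```javascript
// Hypothetical helper: a threshold scale with n - 1 domain separators
// and n range values returns the range value for the interval that
// contains the input.
function thresholdScale(domain, range) {
  return (value) => {
    let i = 0;
    // Count how many separators the value has reached or passed.
    while (i < domain.length && value >= domain[i]) i++;
    return range[Math.min(i, range.length - 1)];
  };
}

// Made-up sextile separators for the "x" column (five separators).
const sextileX = [100, 200, 300, 400, 500];
const heatColor = thresholdScale(sextileX, [
  "rgb(0, 0, 255)", "rgb(51, 0, 204)", "rgb(102, 0, 153)",
  "rgb(153, 0, 102)", "rgb(204, 0, 51)", "rgb(255, 0, 0)"
]);

console.log(heatColor(50));  // "rgb(0, 0, 255)"
console.log(heatColor(250)); // "rgb(102, 0, 153)"
console.log(heatColor(900)); // "rgb(255, 0, 0)"
```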

Improving Rendering with SQL Extensions

Marks defined in Vega specify how to render data-backed geometric primitives for a visualization. Because these are visual primitives, the default units for defining position and size are in pixels. Pixel units usually are not directly representable by the data space, so the driving data must be mapped to pixel space to be used effectively. In many cases, this data space-to-pixel space mapping can be handled with scales. However, in a number of instances, particularly in geo-related cases, you want to size the primitives in world space units, such as meters. These units cannot be easily converted to pixel units using Vega scales.

This tutorial describes how to use available SQL extension functions in Vega to map meters to pixels, thereby improving map rendering.

Let's look at a basic example. The following uses a public political contributions dataset and draws circles for the points, positioned using the GPS location of the contributor. The circles are colored by the recipient's political party affiliation and sized to be 10 pixels in diameter:

{
  "width": 1146,
  "height": 1116,
  "data": [
    {
      "name": "pointmap",
      "sql": "SELECT lon, lat, recipient_party, rowid FROM fec_contributions_oct WHERE (lon BETWEEN -119.49268182426508 AND -76.518508633361) AND (lat BETWEEN 21.99999999999997 AND 53.999999999999716) LIMIT 2000000"
    }
  ],
  "scales": [
    {
      "name": "pointmap_fillColor",
      "type": "ordinal",
      "domain": ["D","R","I"],
      "range": ["deepskyblue","crimson","gold"],
      "default": "peru",
      "nullValue": "gainsboro"
    }
  ],
  "projections": [
    {
      "name": "merc",
      "type": "mercator",
      "bounds": {
        "x": [-119.49268182426508,-76.518508633361],
        "y": [21.99999999999997,53.999999999999716]
      }
    }
  ],
  "marks": [
    {
      "type": "symbol",
      "from": {"data": "pointmap"},
      "properties": {
        "xc": {"field": "lon"},
        "yc": {"field": "lat"},
        "fillColor": {"scale": "pointmap_fillColor","field": "recipient_party"},
        "shape": "circle",
        "width": 10,
        "height": 10
      },
      "transform": {"projection": "merc"}
    }
  ]
}

The resulting render, composited over a basemap courtesy of Mapbox, looks like this:

Because the circles are sized using pixels, if you zoom in, the circles stay sized at a fixed 10 pixels. The size of the dots does not stay relative to the area of the map originally covered:

...

"projections": [
  {
    "name": "merc",
    "type": "mercator",
    "bounds": {
      "x": [-112.67762110616854,-112.15822455336946],
      "y": [40.30629722096336,40.69091660556256]
    }
  }
],

...

The resulting render in this case looks like this:

To keep the size of the points relative to an area on the map, you need to define the size of the points in meters. Currently, Vega does not provide a scale that maps meters in a mercator-projected space to pixel units. To bypass this limitation, you can use an OmniSci extension function that performs meters-to-pixels conversion using a mercator-projected space.

For scalar columns, such as lon/lat, use convert_meters_to_merc_pixel_width and convert_meters_to_merc_pixel_height.

For geo POINT columns, you use:

Because the extension functions can only return scalar values, each dimension (width and height) must have its own extension function.
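
The idea behind a meters-to-pixels conversion can be sketched as follows. This is not the implementation of the extension functions: the function name metersToMercPixelWidth is hypothetical, the math is a simplified approximation (ground meters are stretched by 1 / cos(latitude) in a mercator projection), and treating the last argument as a minimum pixel size is an assumption. Only the width is shown; the height depends on the nonlinear mercator y coordinate and is omitted.

```javascript
// Mercator (EPSG:900913) meters per degree of longitude.
const MERC_METERS_PER_DEGREE = 111319.490778;

// Hypothetical sketch: approximate on-screen width, in pixels, of a
// length given in ground meters at a given latitude.
function metersToMercPixelWidth(meters, lat, minLon, maxLon, imgWidth, minWidth) {
  // Width of the current view in mercator meters.
  const viewMercMeters = (maxLon - minLon) * MERC_METERS_PER_DEGREE;
  // Mercator stretches ground distances by 1 / cos(latitude).
  const mercMeters = meters / Math.cos((lat * Math.PI) / 180);
  const pixels = (mercMeters / viewMercMeters) * imgWidth;
  // Assumption: never return less than the minimum pixel size.
  return Math.max(pixels, minWidth);
}

// Zoomed out over most of the US: 1 km is less than a pixel, so the minimum applies.
console.log(metersToMercPixelWidth(1000, 40.5, -119.49268182426508, -76.518508633361, 1146, 1)); // 1

// Zoomed in: the same 1 km covers roughly 26 pixels.
console.log(metersToMercPixelWidth(1000, 40.5, -112.67762110616854, -112.15822455336946, 1146, 1));
```

Because the result depends on the view bounds, the value must be recomputed for every pan or zoom, which is why the bounds appear as arguments in the SQL.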

To apply these functions to the previous example, add these extension functions to your SQL code, and use the results of the extension functions to determine the width and height of the circles. The following example sizes the points to 1 km in diameter:

{
  "width": 1146,
  "height": 1116,
  "data": [
    {
      "name": "pointmap",
      "sql": "SELECT lon, lat, convert_meters_to_merc_pixel_width(1000, lon, lat, -119.49268182426508, -76.518508633361, 1146, 1) as width,
	  convert_meters_to_merc_pixel_height(1000, lon, lat, 21.99999999999997, 53.999999999999716, 1116, 1) as height, recipient_party, 
	  rowid FROM fec_contributions_oct WHERE (lon BETWEEN -119.49268182426508 AND -76.518508633361) AND
	  (lat BETWEEN 21.99999999999997 AND 53.999999999999716) LIMIT 2000000"
    }
  ],
  "scales": [
    {
      "name": "pointmap_fillColor",
      "type": "ordinal",
      "domain": ["D","R","I"],
      "range": ["deepskyblue","crimson","gold"],
      "default": "peru",
      "nullValue": "gainsboro"
    }
  ],
  "projections": [
    {
      "name": "merc",
      "type": "mercator",
      "bounds": {
        "x": [-119.49268182426508,-76.518508633361],
        "y": [21.99999999999997,53.999999999999716]
      }
    }
  ],
  "marks": [
    {
      "type": "symbol",
      "from": {"data": "pointmap"},
      "properties": {
        "xc": {"field": "lon"},
        "yc": {"field": "lat"},
        "fillColor": {"scale": "pointmap_fillColor","field": "recipient_party"},
        "shape": "circle",
        "width": {"field": "width"},
        "height": {"field": "height"}
      },
      "transform": {"projection": "merc"}
    }
  ]
}

Note the differences in this Vega code compared to the earlier example. Two extension function calls were added to the SELECT list of the SQL code:

convert_meters_to_merc_pixel_width(1000, lon, lat, -119.49268182426508, -76.518508633361, 1146, 1) as width
convert_meters_to_merc_pixel_height(1000, lon, lat, 21.99999999999997, 53.999999999999716, 1116, 1) as height

This converts 1 km to a pixel value in width/height based on the current view of a mercator-projected map.

The width/height calculated here is now used to drive the width/height of the circle using this JSON in the Vega mark:

"width": {"field": "width"},
"height": {"field": "height"}

The resulting render looks like this:

Now, if you zoom in, the size of the points stays relative to the map:

...

"projections": [
  {
    "name": "merc",
    "type": "mercator",
    "bounds": {
      "x": [-112.67762110616854,-112.15822455336946],
      "y": [40.30629722096336,40.69091660556256]
    }
  }
],

...

...with the following resulting render:

The following code zooms in a bit more:

...

"projections": [
  {
    "name": "merc",
    "type": "mercator",
    "bounds": {
      "x": [-112.52569969159018,-112.37518840098163],
      "y": [40.527435942877986,40.63875135019538]
    }
  }
],

...

and results in the following render:

Notice that the WHERE clause of the SQL filters out points not in view:

... WHERE (lon BETWEEN -119.49268182426508 AND -76.518508633361) AND (lat BETWEEN 21.99999999999997 AND 53.999999999999716) ...

However, when zoomed in far enough, a point can disappear, even though its associated circle is still in view. This occurs because only the center of the circle is checked in this filter and not the whole rendered circle.

To illustrate this, consider a render of the following query:

SELECT lon, lat, convert_meters_to_merc_pixel_width(1000, lon, lat, -112.49286564041051, -112.34235434980197, 1146, 1) as width,
convert_meters_to_merc_pixel_height(1000, lon, lat, 40.53172840847458, 40.64303667787769, 1116, 1) as height, recipient_party, 
rowid FROM fec_contributions_oct WHERE (lon BETWEEN -112.49286564041051 AND -112.34235434980197) AND (lat BETWEEN 40.53172840847458 
AND 40.64303667787769) LIMIT 2000000

The resulting image looks like this:

If you pan to the left, the blue dot disappears, although it should still be visible. Here is the query:

SELECT lon, lat, convert_meters_to_merc_pixel_width(1000, lon, lat, -112.48984490770093, -112.33933361709238, 1146, 1) as width,
convert_meters_to_merc_pixel_height(1000, lon, lat, 40.5315287650088, 40.642837366411584, 1116, 1) as height, recipient_party, 
rowid FROM fec_contributions_oct WHERE (lon BETWEEN -112.48984490770093 AND -112.33933361709238) AND (lat BETWEEN 40.5315287650088 
AND 40.642837366411584) LIMIT 2000000

...and the resulting image:

To alleviate this issue, you can use the extension functions as a filter:

For scalar columns (such as lon/lat): is_point_size_in_merc_view
For geo POINT columns: is_point_size_in_view

These extension functions take as arguments the parameters of the view along with the point size in meters, and return true if the point is in the defined view, or false otherwise.
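
The difference between filtering on the center of a point and filtering on its rendered size can be sketched as follows. The function name isPointSizeInView is hypothetical, the size is assumed to be a diameter in meters, and the circle is approximated by its bounding box in degrees; this illustrates the idea and is not the implementation of the extension functions.

```javascript
// Approximate meters per degree of latitude.
const METERS_PER_DEGREE = 111319.490778;

// Hypothetical sketch: true if any part of a point of the given size
// (diameter, in meters) overlaps the view, not just its center.
function isPointSizeInView(lon, lat, sizeMeters, minLon, maxLon, minLat, maxLat) {
  const halfLatDeg = sizeMeters / 2 / METERS_PER_DEGREE;
  // A degree of longitude shrinks by cos(latitude) away from the equator.
  const halfLonDeg = halfLatDeg / Math.cos((lat * Math.PI) / 180);
  return (
    lon + halfLonDeg >= minLon && lon - halfLonDeg <= maxLon &&
    lat + halfLatDeg >= minLat && lat - halfLatDeg <= maxLat
  );
}

// A made-up point whose center is just left of the view: a center-only
// BETWEEN filter rejects it, but its 1 km circle still overlaps the view.
const view = [-112.48984490770093, -112.33933361709238, 40.5315287650088, 40.642837366411584];
console.log(isPointSizeInView(-112.492, 40.58, 1000, ...view)); // true
console.log(isPointSizeInView(-112.5, 40.58, 1000, ...view));   // false
```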

Referring back to the original example, replace the WHERE clause with its is_point_size_in_merc_view equivalent:

SELECT lon, lat, convert_meters_to_merc_pixel_width(1000, lon, lat, -112.49286956397471, -112.34028759586535, 1146, 1) as width,
convert_meters_to_merc_pixel_height(1000, lon, lat, 40.531763370983555, 40.64460162657784, 1116, 1) as height, recipient_party, 
rowid FROM fec_contributions_oct WHERE is_point_size_in_merc_view(lon, lat, 1000, -112.49286956397471, -112.34028759586535, 
40.531763370983555, 40.64460162657784) LIMIT 2000000

This results in:

Now, pan slightly to the left again:

SELECT lon, lat, convert_meters_to_merc_pixel_width(1000, lon, lat, -112.48980727316768, -112.33722530505833, 1146, 1) as width, convert_meters_to_merc_pixel_height(1000, lon, lat, 40.531763370983555, 40.64460162657784, 1116, 1) as height, recipient_party, 
rowid FROM fec_contributions_oct WHERE is_point_size_in_merc_view(lon, lat, 1000, -112.48980727316768, -112.33722530505833, 
40.531763370983555, 40.64460162657784) LIMIT 2000000

The result is:

Notice that the blue dot now passes the filter and stays in view.

Current Limitations

This approach is not an accurate representation of area on a map. It provides a reasonable approximation, but more error is introduced as you approach the poles, because this approach works only in two dimensions. Near the poles, you would realistically see areas that are oblong and egg-shaped. However, this approach works reasonably well for most inhabitable geo locations.
The symbol mark types are procedurally generated and use a simple POINT primitive in the underlying graphics API. This primitive has a maximum pixel size. The limit is defined by the graphics driver implementation, but testing shows it to be 2000 pixels in diameter. This limit can have an effect if you zoom in tight on areas where the circles are large. You may see points disappear, similar to the filtering issue described earlier. This most likely occurs because the width/height generated by the meters-to-pixels extension functions exceeds this limit.
As a workaround, use the legacysymbol mark type instead of symbol. The legacysymbol mark type does not render the shape procedurally, so it is not affected by this limit. The legacysymbol mark was deprecated in favor of the improved rendering performance of the procedural approach.
When you use extension functions in SQL, you cannot use Vega scales to do further mapping; for example, you cannot use the contribution "amount" column to drive the size of the points in meters with a Vega scale. Any additional mapping must be done in the SQL, which may not be trivial depending on the complexity of the mapping.

Vega Reference Overview

A Vega specification is a JSON-formatted structure that describes a visualization, which can be sent to the backend for rendering. This document introduces the Vega specification syntax and provides links to topics that provide more details about each Vega property.

For examples of using Vega, see Tutorials. You can also see and edit examples in Try Vega.

Specification Language Syntax

The Vega specification includes properties for describing the source data, mapping the data to the visualization area, and visual encoding. The root Vega specification supported by OmniSci has the following JSON structure and top-level properties:

{
  "width": <number>,
  "height": <number>,
  "data": [],
  "projections": [],
  "scales": [],
  "marks": []
}

| Property | Type | Description |
| --- | --- | --- |
| width and height | unsigned integer | Visualization area width and height, in pixels. Both properties are required. Example: Set the viewing area width to 384 pixels and the height to 564 pixels: "width": 384, "height": 564 |
| data | array | Source data. The Vega data model uses tabular data, similar to a spreadsheet, organized in rows with any number of named columns. JSON format: "data": [] |
| projections | array | Projection data. Maps longitude and latitude data to projected x and y coordinates. JSON format: "projections": [] |
| scales | array | Data-to-visualization area mapping. Maps visually encoded data values to pixel positions with attributes, such as color. JSON format: "scales": [] |
| marks | array | Geometric primitive used to visually encode data. JSON format: "marks": [] |

Format Rules

Property names are case-sensitive.
Property values are typed.
Unsupported properties are ignored by the rendering engine.

data Property

Use the Vega data property to specify the visualization data sources by providing an array of one or more data definitions. A data definition must be an object identified by a unique name, which can be referenced in other areas of the specification. Data can be statically defined inline ("values":), can reference columns from a database table using a SQL statement ("sql":), or can be loaded from an existing data set ("source":).

JSON format:

"data": [
  {
    "name": <dataID>,
    "format": {
      "type": "lines" | "polys",
      "coords": {
        "x": <array>,
        "y": <array>
      },
      "layout": "interleaved" | "sequential"
    },
    "values": <valueSet> | "sql": <sqlStatement> | "source": <dataSource>,
    "transform": [
      {
        "type": "aggregate",
        "fields": [<string>, ...],
        "ops": [<keyword>, ...],
        "as": [<string>, ...]
      }
    ]
  },
  {
     ...
  }
]

The data specification has the following properties:

| Property | Data Type | Required | Description |
| --- | --- | --- | --- |
| name | string | Yes | User-assigned name of the data set. |
| format | string/object | For polys and lines marks | How the data are parsed. polys and lines are the only supported format mark types and are for rendering purposes only. Use the single string "short form" for polygon and simple linestring renders. Use the JSON object "long form" to provide more information for rendering more complex line types. |
| values, sql, or source | string (sql, source); JSON data (values) | One of the three | Data source. values: Embedded, static data values defined inline as JSON. sql: A SQL query that loads the data. source: The name of an existing data set to load the data from. |
| transform | array | No | An array of transforms to perform on the input data. The output of the transform pipeline then becomes the value of this data set. Currently, transforms can be used only with source data set types. |
| enableHitTesting | boolean | No | If true, automatically adds rowid column(s) to the SQL statement, which is required for hit-testing using the get_result_row_for_pixel endpoint. |

Examples

Load discrete x and y column values using the values database table type:

vegaSpec = {
    width: 384,
    height: 564,
    data: [
        {
          name: "coordinates",
          values: [ {"x":0, "y":3}, {"x":1, "y":5} ]
        }
    ],
    scales: [ ... elided ... ],
    marks: [ ... elided ... ]
};

Use the sql database table type to load latitude and longitude coordinates from the tweets_data database table:

vegaSpec = {
    width: 384,
    height: 564,
    data: [
        {
          name: "tweets",
          sql: "SELECT lon as x, lat as y FROM tweets_data WHERE (lon >= -32 AND lon < 66) AND (lat >= -45 AND lat < 68)"
        }
    ],
    scales: [ ... elided ... ],
    marks: [ ... elided ... ]
};

Use the source type to use the data set defined in the sql data section and perform aggregation transforms:

vegaSpec = {
      width: 384,
      height: 564,
      data: [
              {
                      name: "tweets",
                      sql: "SELECT lon as x, lat as y FROM tweets_data WHERE (lon >= -32 AND lon < 66) AND (lat >= -45 AND lat < 68)"
              },
              {
                      name: "tweets_stats",
                      source: "tweets",
                      transform: [
                              {
                                      type: "aggregate",
                                      fields: ["x", "x"],
                                      ops: ["min", "max"],
                                      as: ["minx", "maxx"]
                              }
                      ]
              },
      ],
      scales: [ ... elided ... ],
      marks: [ ... elided ... ]
}

Data Properties

name

The name property uniquely identifies a data set, and is used for reference by other Vega properties, such as the Marks property.

format

The format property indicates that data preprocessing is needed before rendering the query result. If this property is not specified, data is assumed to be in row-oriented JSON format.

This property is required for Polys and Lines mark types. The property has one of two forms:

The "short form", where format is a single string, which must be either polys or lines. This form is used for all polygon rendering, and for fast ‘in-situ’ rendering of LINESTRING data.
The "long form", where format is an object containing other properties, as follows:

Format Property

Description

type

Marks property type: lines or polys.

coords

Applies to type: lines.

Specifies x and y arrays, which must both be the same size.

This allows the columns that pertain to line rendering to be extracted and placed in a rendering buffer. The coords property also dictates the ordering of points in the line.

Separate x- and y-array columns are also supported.

layout

(optional) Applies to type: lines.

Specifies how vertices are packed in the vertices column. All arrays must have the same layout:

interleaved: (default) All elements corresponding to a single vertex are ordered in adjacent pairs. For example, x0, y0, x1, y1, x2, y2.
sequential: All elements of the same axis are adjacent. For example, x0, x1, x2, y0, y1, y2.

For lines, each row in the query corresponds to a single line.
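
The two layouts can be illustrated with a short sketch that unpacks one row's vertices array into separate x and y arrays. The helper name unpackVertices is hypothetical and exists only to show the ordering; the backend performs the equivalent step internally.

```javascript
// Hypothetical helper: split a packed vertices array into x and y
// arrays according to the layout.
function unpackVertices(values, layout = "interleaved") {
  if (layout !== "interleaved" && layout !== "sequential") {
    throw new Error(`unknown layout: ${layout}`);
  }
  const count = values.length / 2; // number of vertices
  const x = [];
  const y = [];
  for (let i = 0; i < count; i++) {
    if (layout === "interleaved") {
      // x0, y0, x1, y1, ...
      x.push(values[2 * i]);
      y.push(values[2 * i + 1]);
    } else {
      // x0, x1, ..., y0, y1, ...
      x.push(values[i]);
      y.push(values[count + i]);
    }
  }
  return { x, y };
}

console.log(unpackVertices([0, 1, 2, 3, 4, 5], "interleaved")); // { x: [0, 2, 4], y: [1, 3, 5] }
console.log(unpackVertices([0, 1, 2, 3, 4, 5], "sequential"));  // { x: [0, 1, 2], y: [3, 4, 5] }
```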

This lines format example of interleaved data renders ten lines, all of the same length.

"data": [
  {
    "name": "table",
    "sql": "select lineArrayTest.rowid as rowid, vertices, color from lineArrayTest order by color desc limit 10;",
    "format": {
      "type": "lines",
      "coords": {
        "x": ["vertices"],
        "y": [
          {"from": "vertices" }
        ]
      },
      "layout": "interleaved"
    }
  }
]

In this lines format example of sequential data, x only stores points corresponding to the x coordinate and y only stores points corresponding to the y coordinate. Make sure that columns only contain a single coordinate if using multiple columns in sequential layout.

"data": [
  {
    "name": "table",
    "sql": "select lineArrayTestSeq.rowid as rowid, x, y, color from lineArrayTestSeq order by color desc limit 10;",
    "format": {
      "type": "lines",
      "coords": {
        "x": ["x"],
        "y": ["y"]
      },
      "layout": "sequential"
    }
  }
],

The following example shows a fast "in-situ" LINESTRING format:

"data": [
  {
    "name": "table",
    "format": "lines",
    "sql": "SELECT rowid, linestring_column, ... FROM ..."
  }
]

The following example shows a polys format:

"data": [
  {
    "name": "polys",
    "format": "polys",
    "sql": "SELECT ... elided ..."
  }
]

Data Source

The database table source property key-value pair specifies the location of the data and defines how the data is loaded:

| Key | Value | Description |
| --- | --- | --- |
| source | String | Data is loaded from an existing data set. |
| sql | SQL statement | Data is loaded using a SQL statement. |
| values | JSON data | Data is loaded from static, key-value pair data definitions. |

transform

Transforms process a data stream to calculate new aggregated statistic fields and derive new data streams from them. Currently, transforms are specified only as part of a source data definition. Transforms are defined as an array of specific transform types that are executed in sequential order. Each element of the array must be an object and must contain a type property. Currently, two transform types are supported: aggregate and formula.

Type

Description and Properties

aggregate

Performs aggregation operations on input data columns to calculate new aggregated statistic fields and derive new data streams from them. The following properties are required:

fields: An array of strings referencing columns from the sourced data table.

ops: An array of keyword strings and objects indicating the predefined operation to perform. For objects, the type property is required to name the type of the aggregation function. Supported operators:

count: The total count of data objects in the group.
countdistinct: The number of distinct values in an input data column; operates only on numeric or dictionary-encoded string columns.
distinct: An array of distinct values from an input data column; operates only on numeric or dictionary-encoded string columns.
max: The maximum field value.
mean / average / avg: The mean (average) field value.
median: The median of an input data column; operates only on numeric columns.
min: The minimum field value.
missing: The count of field values that are null or undefined.
quantile: An array of quantile separators; see https://en.wikipedia.org/wiki/Quantile. Operates only on numeric columns:
- numQuantiles: The number of contiguous intervals to create; returns the separators for the intervals. The number of separators equals numQuantiles - 1. Range: 1-100. Default: 4
- includeExtrema: Whether to include min and max values (extrema) in the resulting separator array. When true, the resulting array size is numQuantiles + 1. Values: true or false. Default: false
sum: The sum of field values.
stddev: The sample standard deviation of field values.
stddevp: The population standard deviation of field values.
valid: The count of field values that are neither null nor undefined.
variance: The sample variance of field values.
variancep: The population variance of field values.

as: An array of strings used as output names of the operations for later reference.

formula

Evaluates a user-defined expression. The following properties are required:

as: A string used as an output name for later reference.

Note: Currently, expressions can only be performed against outputs (as values) from prior aggregate transforms.

See Tutorial: Using Transforms for more detailed examples.
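
To make the pairing of fields, ops, and as concrete, the following sketch evaluates a subset of the aggregate operators on in-memory rows. The function name aggregate and the sample rows are hypothetical, only keyword operators are handled (not the quantile object form), and counting null rows under count is an assumption based on the operator descriptions above.

```javascript
// Hypothetical sketch: fields[i] is aggregated with ops[i] and stored
// under the name as[i].
function aggregate(rows, { fields, ops, as }) {
  const result = {};
  fields.forEach((field, i) => {
    const all = rows.map((row) => row[field]);
    const valid = all.filter((v) => v !== null && v !== undefined);
    const n = valid.length;
    const sum = valid.reduce((acc, v) => acc + v, 0);
    const mean = sum / n;
    // Sum of squared deviations from the mean.
    const squares = valid.reduce((acc, v) => acc + (v - mean) ** 2, 0);
    const table = {
      count: all.length,
      valid: n,
      missing: all.length - n,
      min: Math.min(...valid),
      max: Math.max(...valid),
      sum,
      mean,
      variance: squares / (n - 1),          // sample variance
      variancep: squares / n,               // population variance
      stddev: Math.sqrt(squares / (n - 1)), // sample standard deviation
      stddevp: Math.sqrt(squares / n)       // population standard deviation
    };
    if (!(ops[i] in table)) {
      throw new Error(`operator not covered by this sketch: ${ops[i]}`);
    }
    result[as[i]] = table[ops[i]];
  });
  return result;
}

const rows = [2, 4, 4, 4, 5, 5, 7, 9, null].map((x) => ({ x }));
console.log(aggregate(rows, {
  fields: ["x", "x", "x", "x"],
  ops: ["min", "max", "mean", "stddevp"],
  as: ["minx", "maxx", "meanx", "sdx"]
})); // { minx: 2, maxx: 9, meanx: 5, sdx: 2 }
```

Repeating a field name in fields, as in this call, is how several statistics of the same column are requested at once.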

enableHitTesting

If true, automatically adds rowid column(s) to the SQL statement where appropriate, enabling the data block for hit-testing using the get_result_row_for_pixel endpoint.

If false, the data block is not automatically hit-test enabled, and any later get_result_row_for_pixel calls return empty hit-test results.

If the enableHitTesting property is not present, the following legacy behavior is used as the default:

If the SQL statement represents a projection query, hit-testing is enabled if a rowid column is explicitly projected.
If the SQL statement represents an aggregate query, hit-testing is always enabled.

This legacy behavior will likely be deprecated and removed in an upcoming version of OmniSci. At that point, the enableHitTesting property will be required for activating hit-test support for the data.
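
The decision rules above can be summarized in a short sketch. The function name and the isAggregateQuery and projectsRowid flags are hypothetical inputs used for illustration; in practice the backend derives the equivalent information from the SQL statement itself.

```javascript
// Hypothetical sketch of the hit-testing rules for a data block.
function isHitTestEnabled({ enableHitTesting, isAggregateQuery, projectsRowid }) {
  // An explicit setting always wins.
  if (enableHitTesting !== undefined) {
    return enableHitTesting === true;
  }
  // Legacy default: aggregate queries are always enabled; projection
  // queries are enabled only if rowid is explicitly projected.
  return Boolean(isAggregateQuery || projectsRowid);
}

console.log(isHitTestEnabled({ enableHitTesting: false, isAggregateQuery: true })); // false
console.log(isHitTestEnabled({ isAggregateQuery: false, projectsRowid: true }));    // true
console.log(isHitTestEnabled({ isAggregateQuery: false, projectsRowid: false }));   // false
```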

projections Property

Vega projections map longitude and latitude data to projected x and y coordinates. When working with geospatial data in OmniSci, you can use projections to define geographic points and regions.

General projections property JSON format:

"projections": [
       {
         "name": "<projectionName>",
         "type": "<projectionType>",
         "bounds": {
               "x": [<minLong>,<maxLong>],
               "y": [<minLat>,<maxLat>]
         }
       }
]

When you specify a projection, you must reference it in the Marks Property using the transform object. For example, if you define the projection my_mercator_projection:

"projections": [
  {
    "name": "my_mercator_projection",
    "type": "mercator",
    "bounds": {
      "x": [-120.0, 120.0],
      "y": [-20.0, 20.0]
    }
  }
]

you then reference it as follows:

"marks": [
  {
    "type": "symbol",
    "from": { "data": "fec_contributions_oct" },
    "properties": { ... elided ... },
    "transform": {
      "projection": "my_mercator_projection"
    }
  }
]

The projections specification has the following properties:

Property

Data Type

Required

Description

name

string

User-assigned name of the projection.

type

string

Projection type. Currently supported types:

mercator: Mercator map projection.

bounds

object

Specifies the longitude and latitude bounding box for the projection. Default values:

x: [-180.0, 180.0]
y: [-85.0, 85.0]
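
As a rough illustration of what a mercator projection with bounds does, the following sketch maps a longitude/latitude pair to a position in the visualization area using the standard spherical mercator formula. The function names are hypothetical, and placing the pixel origin at the lower-left corner with y increasing upward is an assumption; flip y if your target uses a top-left origin.

```javascript
// Standard spherical mercator y for a latitude in degrees (unitless).
function mercatorY(latDeg) {
  const phi = (latDeg * Math.PI) / 180;
  return Math.log(Math.tan(Math.PI / 4 + phi / 2));
}

// Hypothetical sketch: project lon/lat into a width x height area
// whose edges correspond to the projection bounds.
function projectToPixels(lon, lat, bounds, width, height) {
  const [minLon, maxLon] = bounds.x;
  const [minLat, maxLat] = bounds.y;
  // Longitude maps linearly to x.
  const x = ((lon - minLon) / (maxLon - minLon)) * width;
  // Latitude maps to y through the nonlinear mercator function.
  const yMin = mercatorY(minLat);
  const yMax = mercatorY(maxLat);
  const y = ((mercatorY(lat) - yMin) / (yMax - yMin)) * height;
  return { x, y };
}

const bounds = { x: [-120.0, 120.0], y: [-20.0, 20.0] };
// The center of the bounds lands at (approximately) the center of the image.
console.log(projectToPixels(0, 0, bounds, 1024, 1024));
```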

Example

Use a Vega projection alongside array columns:

{
      "width": 1024,
      "height": 1024,
      "data": [
              {
                      "name": "table",
                      "sql": "SELECT rowid, coords[1] as x, coords[2] as y FROM cities WHERE coords[1] BETWEEN $minLon AND $maxLon AND coords[2] BETWEEN $minLat AND $maxLat"
              }
      ],
      "projections": [
      {
              "name": "projection",
              "type": "mercator",
              "bounds": {
              "x": [-120.0, 120.0],
              "y": [-20.0, 20.0]
              }
      }
      ],
      "scales": [
      ],
      "marks": [
              {
                      "type": "symbol",
                      "from": {"data": "table"},
                      "properties": {
                              "shape": "circle",
                              "xc": {
                                      "field": "x"
                              },
                              "yc": {
                                      "field": "y"
                              },
                              "fillColor": "darkblue",
                              "width": 25,
                              "height": 25
                      },
                      "transform": {
                              "projection": "projection"
                      }
              }
      ]
}

scales Property

The Vega scales property maps visually encoded data values to pixel positions with attributes, such as color. See the documentation for additional background information about scales.

General scales property JSON format:

"scales": [
  {
    "name": <scaleID>,
    "type": <scaleType>,
    "domain": <inputValues>,
    "range": <outputValues>,
    "accumulator": <accumulatorType>,
    "default": <defaultOutputValue>,
    "nullValue": <nullDataValue>
  },
  {
     ...
  }
],

The scales specification is an array of one or more scale definitions. Each definition uses the properties shown in the format above and described under Scales Properties.

Note: As a general rule, limit the total number of domain and range values used to a maximum of 1000. Exceeding this limit can cause an error.

Example

vegaSpec = {
    width: 384,
    height: 564,
    data: [ ... elided ... ],
    scales: [
        {
            name: "x",
            type: "linear",
            domain: [ -100, 999 ],
            range: "width"
        },
        {
            name: "y",
            type: "linear",
            domain: [ 0, 500 ],
            range: "height"
        }
    ],
    marks: [ ... elided ... ]
};

Scales Properties

name

The name property uniquely identifies the scale for reference by other properties.

type

The type property specifies how to transform input (domain) data values to output (range) visual values. Vega supports the following transforms, categorized by quantitative, discrete, and discretizing scales:

Quantitative Scales

Discrete Scales

Discretizing Scales

domain

The domain field specifies the domain of input data values. For quantitative data, this can take the form of a two-element array.

Example:

Specify minimum and maximum input values.

domain: [ -100, 999 ]

For ordinal or categorical data, the domain can be an array of valid input values.

Example

Specify valid input data languages.

"domain": ["en",  "es", "fr"]

range

Scale range specifies the set of visual values. For numeric values, the range can take the form of a two-element array with minimum and maximum values. For ordinal or quantized data, the range can be an array of desired output values, which are mapped to elements in the specified domain.

Scale ranges can be specified in the following ways:

As an array of static values: "range": [0, 500] or "range": ['a', 'b', 'c'].
Using pre-defined literals: "range": "width" or "range": "height".

Example

Specify a color scale that quantizes input values between 0 and 100 among five visual output colors.

{
  name: "color",
  type: "quantize",
  domain: [ 0, 100 ],
  range: [ "#115f9a", "#1984c5", "#c9e52f", "#d0ee11", "#d0f400"
  ]
}
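
The behavior of this quantize scale can be sketched as follows: the domain is divided into as many equal-width bins as there are range values. The helper name quantizeScale is hypothetical, and clamping out-of-domain inputs to the first or last color is an assumption rather than documented backend behavior.

```javascript
// Hypothetical helper: divide [d0, d1] into range.length equal bins
// and return the range value for the bin containing the input.
function quantizeScale(domain, range) {
  const [d0, d1] = domain;
  return (value) => {
    const t = (value - d0) / (d1 - d0); // position within the domain
    const i = Math.floor(t * range.length);
    // Assumption: clamp out-of-domain values to the end bins.
    return range[Math.max(0, Math.min(range.length - 1, i))];
  };
}

const color = quantizeScale(
  [0, 100],
  ["#115f9a", "#1984c5", "#c9e52f", "#d0ee11", "#d0f400"]
);
console.log(color(0));   // "#115f9a"
console.log(color(50));  // "#c9e52f"
console.log(color(100)); // "#d0f400"
```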

Example

Specify a y scale that linearly maps input values between 0 and 500 to the height of the visualization area.

{
    name: "y",
    type: "linear",
    domain: [ 0, 500 ],
    range: "height"
}
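
The mapping that this linear scale performs can be sketched in a few lines. The helper name linearScale is hypothetical, and the sketch assumes that the "height" literal resolves to the pixel range [0, height] for a visualization area 564 pixels high.

```javascript
// Hypothetical helper: linearly map a domain interval to a range interval.
function linearScale(domain, range) {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  return (value) => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0);
}

const height = 564; // assumed visualization area height, in pixels
const y = linearScale([0, 500], [0, height]);
console.log(y(0));   // 0
console.log(y(250)); // 282
console.log(y(500)); // 564
```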

default

The default scales property specifies the output value to use when the input domain value does not map to the range.

The default property is not applicable to the threshold scale type, which maps domain values outside of the range to either the lowest or highest range value.
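
For an ordinal scale, the role of default can be sketched as a lookup with a fallback. The helper name ordinalScale is hypothetical, the values come from the pointmap_fillColor scale used in the tutorials, and returning nullValue for null input is an assumption based on the property name.

```javascript
// Hypothetical helper: look up a value in the domain and return the
// range entry at the same position.
function ordinalScale({ domain, range, default: fallback, nullValue }) {
  return (value) => {
    // Assumption: null data maps to nullValue.
    if (value === null || value === undefined) return nullValue;
    const i = domain.indexOf(value);
    // Values outside the domain map to the default.
    return i === -1 ? fallback : range[i];
  };
}

const fillColor = ordinalScale({
  domain: ["D", "R", "I"],
  range: ["deepskyblue", "crimson", "gold"],
  default: "peru",
  nullValue: "gainsboro"
});
console.log(fillColor("R"));  // "crimson"
console.log(fillColor("L"));  // "peru"
console.log(fillColor(null)); // "gainsboro"
```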

accumulator

The accumulator property enables you to identify regional density of data in a layer of a backend render and apply pixel coloring based on the accumulation mode that you have defined. Each data point is rendered individually, providing an accurate representation of data distribution in a spatial setting.

Example

Apply a density accumulator to a linear scale named pointcolor:

{
  "name": "pointcolor",
  "type": "linear",
  "domain": [0.0,1.0],
  "range": ["blue","red"],
  "clamp": true,
  "accumulator": "density",
  "minDensityCnt": 1,
  "maxDensityCnt": 100
}

The color at a pixel is determined by normalizing per-pixel aggregated counts and using that value in the scale function to calculate a color. Normalization is performed according to the required minDensityCnt and maxDensityCnt properties. After normalization, a count equal to minDensityCnt maps to 0 and a count equal to maxDensityCnt maps to 1.

minDensityCnt and maxDensityCnt can have explicit integer values or use one of the following keywords to compute statistical information about per-pixel counts: min, max, -1stStdDev, -2ndStdDev, 1stStdDev, 2ndStdDev.
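
The normalization step can be sketched in JavaScript as follows. Both helpers are hypothetical. The handling of the min and max keywords is an assumed reading of the description above, and the standard deviation keywords are intentionally left out because their exact semantics are not stated here.

```javascript
// Hypothetical helper: normalize a per-pixel count to the interval [0, 1].
function normalizeCount(count, minDensityCnt, maxDensityCnt) {
  const t = (count - minDensityCnt) / (maxDensityCnt - minDensityCnt);
  // Clamp to [0, 1], mirroring "clamp": true in the scale above.
  return Math.min(1, Math.max(0, t));
}

// Hypothetical helper: resolve the "min" and "max" keywords from the
// per-pixel counts of one rendered layer (assumed semantics).
function resolveBound(bound, counts) {
  if (bound === "min") return Math.min(...counts);
  if (bound === "max") return Math.max(...counts);
  return bound; // explicit integer value
}

const counts = [1, 25, 50, 100]; // made-up per-pixel counts for illustration
const lo = resolveBound("min", counts);
const hi = resolveBound("max", counts);
console.log(counts.map((c) => normalizeCount(c, lo, hi)));
```

The normalized value is what the scale function receives as input, so with the pointcolor scale above a normalized 0 corresponds to blue and a normalized 1 to red.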

marks Property

Marks visually encode data using geometric primitives.

General JSON format:

"marks": [
  {
    "type": points | lines | polys | symbol,
    "from": { data: <dataSourceID> },
    "properties": { <propName>: <propVal>, ... <propName>: <propVal> },
    "transform": { <transformType>: <transformName> }
  },
  {
    ...
  }
],

A Vega marks specification includes the following properties:

Each marks property is associated with the specified mark type.

Marks are rendered in marks property array order.

Apply the x and y scales to the x and y database table columns to scale the data to the visualization area width and height. For example:

const exampleVega = {
  "width": 384,
  "height": 564,
  "data": [ ... elided ... ],
  "scales": [
    {
      "name": "x",
      "type": "linear",
      "domain": [-3650484.1235206556, 7413325.514451755],
      "range": "width"
    },
    {
      "name": "y",
      "type": "linear",
      "domain": [-5778161.9183506705, 10471808.487466192],
      "range": "height"
    }
  ],
  "marks": [
    {
      "type": "points",
      "from": { "data": "tweets" },
      "properties": {
        "x": { "scale": "x", "field": "x" },
        "y": { "scale": "y", "field": "y" }
      }
    }
  ]
};

marks Properties

type

Marks must include a type property that specifies the geometric primitive to use to render the data.

points Type

Specify x and y coordinate values using either constants, or domain and range values of a data reference. If the from property is not specified, the x and y properties fields must be constants.

points Examples

Define a point with size, color, and opacity:

{
  "width": 1024,
  "height": 1024,
  "data": [
    {
      "name": "table",
      "values": [
        {"x": 412, "y": 512, "val": 0.9, "color": "red"},
        {"x": 512, "y": 512, "val": 0.3, "color": "violet"},
        {"x": 612, "y": 512, "val": 0.5, "color": "green"}
      ]
    }
  ],
  "marks": [
    {
      "type": "points",
      "from": {"data": "table"},
      "properties": {
        "x": { "field": "x" },
        "y": { "field": "y" },
        "fillColor": { "field": "color" },
        "size": 150.0,
        "fillOpacity": { "field": "val" },
        "opacity": 0.8
      }
    }
  ]
}

Associate the points geometric primitive with tweets data items.

vegaSpec = {
    "width": 384,
    "height": 564,
    "data": [
        {
            "name": "tweets",
            "sql": "SELECT  ... elided ... "
        }
    ],
    "scales": [ ... elided ... ],
    "marks": [
        {
            "type": "points",
            "from": { data: "tweets" },
            "properties": { ... elided ... }
        },
        { ... elided ... }
    ]
};

lines Type

Specifying the data format property as lines causes the rendering engine to assume a lines database table layout and to extract line-related columns from the table.

lines Example

{
  "type": "lines",
  "from": {"data": "table"},
  "properties": {
    "x": {
      "field": "x",
      "scale": "x"
    },
    "y": {
      "field": "y",
      "scale": "y"
    },
    "strokeColor": {
      "scale": "strokeColor",
      "field": "color"
    },
    "strokeWidth": 2,
    "lineJoin": "miter",
    "miterLimit": 10
  }
}

polys Type

polys Example

const exampleVega = {
  "width": 1004,
  "height": 336,
  "data": [
    {
      "name": "polys",
      "format": "polys",
      "sql": "SELECT ... elided ..."
    }
  ],
  "scales": [ ... elided ... ],
  "marks": [
    {
      "type": "polys",
      "from": {
        "data": "polys"
      },
      "properties": {
        "x": {
          "scale": "x",
          "field": "x"
        },
        "y": {
          "scale": "y",
          "field": "y"
        },
        "fillColor": {
          "scale": "polys_fillColor",
          "field": "avgContrib"
        },
        "strokeColor": "white",
        "strokeWidth": 0,
        "lineJoin": "miter",
        "miterLimit": 10
      }
    }
  ]
}

symbol Type

Currently, in symbol mark types, strokes are not visible beneath other marks, regardless of opacity settings.

Specify x and y coordinate values using either constants or domain and range values of a data reference. If the from property is not specified, the x and y properties fields must be specified using constant values.

symbol Examples

const exampleVega = {
  "width": 733,
  "height": 530,
  "data": [
    {
      "name": "heatmap_query",
      "sql": "SELECT ... elided ... "
    }
  ],
  "scales": [ ... elided ... ],
  "marks": [
    {
      "type": "symbol",
      "from": {
        "data": "heatmap_query"
      },
      "properties": {
        "shape": "square",
        "x": { "field": "x" },
        "y": { "field": "y" },
        "width": 1,
        "height": 1,
        "fillColor": { "scale": "heat_color", "field": "cnt" }
      }
    }
  ]
};

The following example defines symbol mark types including fill, stroke, and general opacity properties:

{
  "width" : 1024,
  "height" : 1024,
  "data": [
      {
          "name" : "table",
          "values": [
              {"x": 200,  "x2": 0.0, "y": 200.0, "y2": 0.0, "val" : 0, "color" : "red", "color2": "yellow", "opacity": 1.0, "fillOpacity":0.75, "strokeOpacity": 0.25},
              {"x": 220.806,  "x2": 0.0, "y": 263.75, "y2": 0.0, "val" : 1, "color" : "blue", "color2": "green", "opacity": 0.5, "fillOpacity": 0.5, "strokeOpacity": 0.5},
              {"x": 240.61216,  "x2": 0.0, "y": 327.5, "y2": 0.0, "val" : 0, "color" : "maroon", "color2": "magenta", "opacity": 0.1, "fillOpacity": 0.25, "strokeOpacity": 0.75}
          ]
      }
  ],
  "marks" : [
      {
          "type" : "symbol",
          "from" : {"data" : "table"},
          "properties" : {
              "shape" : "circle",
              "xc" : { "field" : "x" },
              "yc" : { "field" : "y" },
              "width": 150.0,
              "height": 150.0,
              "opacity": 0.9,
              "fillOpacity": {
                  "field": "fillOpacity"
              },
              "fillColor" : {
                  "field": "color2"
              },
              "strokeWidth" : 10.0,
              "strokeColor" : {
                  "field": "color"
              },
              "strokeOpacity": {
                  "field": "strokeOpacity"
              }
          }
      }
   ]
}

from

The from field specifies the input database table to use.

Example

Use the tweets database table for marks input data.

vegaSpec = {
    "width": 384,
    "height": 564,
    "data": [
        {
          "name": "tweets",
          "sql": "SELECT ... elided ... "
        }
    ],
    "scales": [ ... elided ... ],
    "marks": [
        {
            "type": "polys",
            "from": {"data": "tweets"},
            "properties": { ... elided ... }
        }
    ]
};

If from is not specified, the data source is implicitly a single point with the value defined in the points properties.
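
As a minimal sketch of that case, the following specification defines a points mark with no from property, so its position is given by constants. The coordinate, color, and size values are illustrative, and writing the constants as value objects is an assumption about the accepted form.

```javascript
// Sketch: a points mark with no "from" property, so x and y are constants.
// All values are illustrative; the { value: ... } form is an assumption.
const singlePointVega = {
  width: 384,
  height: 564,
  marks: [
    {
      type: "points",
      properties: {
        x: { value: 192 },
        y: { value: 282 },
        fillColor: "blue",
        size: 10
      }
    }
  ]
};

console.log(JSON.stringify(singlePointVega, null, 2));
```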

properties

Typically, a single mark instance is generated per input data element, except for polys, which uses multiple data elements to represent a line or area shape.

The following table describes the various marks properties and lists the types for which the property is valid.

Value Reference

A value reference describes how to specify marks properties values. The value can be a constant or data object reference:

Examples:

Statically set the point fillColor and size.

"marks": [
  {
    "type": "points",
    "from": {
      "data": "tweets"
    },
    "properties": {

      ... elided ...

      "fillColor": "blue",
      "size": 3
    }
  }
]

For the x marks property, apply the x scale transform to the implicit x-coordinate data column.

"marks": [
  {
    "type": "polys",
    "from": {
      "data": "polys"
    },
    "properties": {
      "x": {
        "scale": "x",
        "field": "x"
      },

      ... elided ...

    }
  }
]
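
Conceptually, a value reference is resolved once per data row. The sketch below is a hypothetical resolver, not the backend's implementation: a bare constant is used as-is, a field reads the value from the row, and a named scale, if present, is applied to the result. The stand-in scale function is made up for illustration.

```javascript
// Hypothetical resolver: how a value reference could be evaluated for one data row.
function resolveValueRef(ref, row, scales) {
  // A bare constant such as "blue" or 3 is used as-is.
  if (ref === null || typeof ref !== "object") return ref;
  // Otherwise read the input from the named field, or fall back to a constant "value".
  const input = ref.field !== undefined ? row[ref.field] : ref.value;
  // Apply the named scale if one is given.
  return ref.scale !== undefined ? scales[ref.scale](input) : input;
}

const scales = { x: (v) => v * 2 }; // stand-in scale function for illustration
const row = { x: 21 };
console.log(resolveValueRef("blue", row, scales));                     // constant
console.log(resolveValueRef({ scale: "x", field: "x" }, row, scales)); // scaled field
```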

Field Reference

A field reference is either a string literal or an object. For object values, the following properties are supported:

Color Value Reference

Typically, color values are specified as a single RGB color value. To set individual color fields or use a different color space, use one of the following color value reference formats:

Examples

Set the red and blue channels of an RGB color as constants, and use a scale transform to determine the green channel:

"fill": {
  "r": {"value": 255},
  "g": {"scale": "green", "field": "g"},
  "b": {"value": 0}
}

Use the rgb color space for the color field:

"fillColor": {
    "field": "color",
    "colorSpace": "rgb"
}

transform

},
"transform": {
      "<key>": "<value>"
}

The value references an existing Vega object by name.

For example, the following transform references the projection my_mercator_projection defined in the top-level Vega projections property.

"projections": [
  {
    "name": "my_mercator_projection",
    "type": "mercator",
    "bounds": {
      "x": [-120.0, 120.0],
      "y": [-20.0, 20.0]
    }
  }
],
"marks": [
  {
    "type": "symbol",
    "from": { "data": "fec_contributions_oct" },
    "properties": { ... elided ... },
    "transform": {
      "projection": "my_mercator_projection"
    }
  }
]

Currently, the only supported transform is projection.

Migration

Migrating Vega Code to Dynamic Poly Rendering

In Release 5.2, the polygon cache for rendering was deprecated and will be completely removed in a subsequent release. Any poly cache rendering in your code must be reworked to use dynamic poly rendering. This topic describes how to migrate poly cache code to dynamic, cacheless rendering.

Caching poly buffers has two main drawbacks, both of which have a significant impact on memory:

The cache cannot span multiple GPUs.
The entire table is cached, regardless of the filter in the query.

In contrast, dynamic poly rendering can utilize all available GPUs and only uses the data that passes any filters.

To move to dynamic poly rendering, determine if you are using the poly cache, and then adjust your code if needed.

HEAVY.AI strongly recommends that you also remove the render-poly-cache-bytes option from your server configuration file, if used. This will help prevent startup warnings or errors in subsequent releases of HEAVY.AI.

Am I Using the Poly Cache?

It may not be immediately obvious if you are using poly cache rendering because no flag is used to enable it. Instead, poly cache rendering is enabled according to the SQL code used in a poly-formatted data block of Vega code. If the query ultimately projects or results in a POLYGON/MULTIPOLYGON column, the cache is not used and no code changes are required.

However, if the query does not reference a POLYGON/MULTIPOLYGON column, but projects a rowid column, then poly caching is in use.

The following Vega code has a poly query that uses the cache system:

...,
"data": [ 
    {
      "name": "polys", 
      "format": "polys", 
      "sql": "SELECT rowid from zipcodes WHERE ST_XMax(heavyai_geo) > -160.0 AND ST_XMin(heavyai_geo) <
              160.0 AND ST_YMax(heavyai_geo) > -60.0 AND ST_YMin(heavyai_geo) < 60.0"
     }
   ], 
...

The sql property of the polys data block, which uses "format": "polys", projects a rowid column. This activates poly caching, even though the geo column (heavyai_geo in this case) is used in the filter.

To convert this Vega code to dynamic, cacheless rendering, change the SQL query:

"sql": "SELECT heavyai_geo from zipcodes WHERE ST_XMax(heavyai_geo) > -160.0 AND ST_XMin(heavyai_geo) < 
        160.0 AND ST_YMax(heavyai_geo) > -60.0 AND ST_YMin(heavyai_geo) < 60.0"

Alternatively, you can check the INFO logs to see if poly caching is used. Look for a LOG statement similar to the following:

Caching <number> bytes (<number> for vbo/ibo) for poly query: <sql> on gpu <number>
Poly cache on gpu <number> now using <number> of <number> bytes.

Cannot cache <number> bytes (<number> for vbo/ibo) for poly query: <sql> on gpu <number>. 
There is currently <number> of <number> total bytes used in the poly cache.

If you find either of these, then a poly cache render query is used.
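
If you want to automate this check, a small script can scan log text for the two statements. The sketch below is a hypothetical helper whose patterns are derived from the log statements shown above, assuming each number placeholder is a run of digits. Reading the log file itself is left out because the log location depends on your installation.

```javascript
// Patterns derived from the two log statements shown above.
// Assumes each <number> placeholder is a run of digits.
const POLY_CACHE_PATTERNS = [
  /Caching \d+ bytes \(\d+ for vbo\/ibo\) for poly query:/,
  /Cannot cache \d+ bytes \(\d+ for vbo\/ibo\) for poly query:/
];

// Hypothetical helper: returns the log lines that indicate a poly cache render query.
function findPolyCacheLines(logText) {
  return logText
    .split("\n")
    .filter((line) => POLY_CACHE_PATTERNS.some((re) => re.test(line)));
}
```

An empty result for logs that cover a full render session suggests that no poly cache render query was issued during that session.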

Migrating Your Code

To migrate from poly cache to dynamic poly rendering, you output a POLYGON/MULTIPOLYGON column in the SQL query of a poly-formatted data block of your Vega code. In the following examples, the heavyai_geo column is a MULTIPOLYGON.

Example 1 - Simple projection query

Cached (only rowid is output):

..., 
"data": [
     {
       "name": "polys",
       "format": "polys",
       "sql": "SELECT rowid from zipcodes WHERE ST_XMax(heavyai_geo) > -160.0 AND ST_XMin(heavyai_geo) < 
               160.0 AND ST_YMax(heavyai_geo) > -60.0 AND ST_YMin(heavyai_geo) < 60.0"
     }
   ],
 ...

Dynamic (heavyai_geo is now projected, along with rowid):

..., 
"data": [
     {
       "name": "polys",
       "format": "polys",
       "sql": "SELECT heavyai_geo, rowid from zipcodes WHERE ST_XMax(heavyai_geo) > -160.0 AND ST_XMin(heavyai_geo) < 
               160.0 AND ST_YMax(heavyai_geo) > -60.0 AND ST_YMin(heavyai_geo) < 60.0"
     }
   ], 
...

Example 2 - Join query using a WITH subquery

Cached (no geo column is projected in the outer query of the join):

...,
"data": [ 
     {
       "name": "choropleth",    
       "format": "polys",
       "sql": "WITH colors AS (SELECT pickup_building as key0, avg(tip_amount/fare_amount) as color 
               FROM taxi_factual_closestbuilding WHERE (fare_amount > 0 AND tip_amount >= 0 AND tip_amount/fare_amount<1) 
               GROUP BY key0) SELECT nyc_buildings.rowid, colors.key0 as key0, colors.color as color 
               FROM colors, nyc_buildings WHERE (colors.key0 = nyc_buildings.bld_id) AND 
               (ST_XMax(nyc_buildings.heavyai_geo) >= -74.06993970987278 AND 
               ST_XMin(nyc_buildings.heavyai_geo) <= -73.87446838332451 AND 
               ST_YMax(nyc_buildings.heavyai_geo) >= 40.64111330399987 AND 
               ST_YMin(nyc_buildings.heavyai_geo) <= 40.87934187208219)"
     }
   ],
...

Dynamic (the geo column is projected in place of rowid):

...,
"data": [
     {
       "name": "backendChoropleth",
       "format": "polys",
       "sql": "WITH colors AS (SELECT pickup_building as key0, avg(tip_amount/fare_amount) as color 
               FROM taxi_factual_closestbuilding WHERE (fare_amount > 0 AND tip_amount >= 0 AND tip_amount/fare_amount<1) 
               GROUP BY key0) SELECT nyc_buildings.heavyai_geo as heavyai_geo, colors.key0 as key0, colors.color as color 
               FROM colors, nyc_buildings WHERE (colors.key0 = nyc_buildings.bld_id) AND 
               (ST_XMax(nyc_buildings.heavyai_geo) >= -74.06993970987278 AND 
               ST_XMin(nyc_buildings.heavyai_geo) <= -73.87446838332451 AND 
               ST_YMax(nyc_buildings.heavyai_geo) >= 40.64111330399987 AND 
               ST_YMin(nyc_buildings.heavyai_geo) <= 40.87934187208219)"
     }
   ], 
...

Try Vega

You can try the HEAVY.AI Vega engine and work with various examples. The database used for the examples includes over 100,000,000 rows. The data is based on the . Some columns have been stripped so that the data can fit on one NVIDIA Tesla P100 GPU.

The following examples are available.
