data Property
Use the Vega data property to specify the visualization data sources by providing an array of one or more data definitions. A data definition must be an object identified by a unique name, which can be referenced in other areas of the specification. Data can be statically defined inline ("values"), can reference columns from a database table using a SQL statement ("sql"), or can be loaded from an existing data set ("source").
JSON format:
The data specification has the following properties:

| Property | Data Type | Required | Description |
| --- | --- | --- | --- |
| name | string | X | User-assigned database table name. |
| format | string/object | | How the data are parsed. polys and lines are the only supported format mark types and are for rendering purposes only. Use the single-string "short form" for polygon and simple linestring renders. Use the JSON object "long form" to provide more information for rendering more complex line types. |
| source / sql / values | string | | Data source. values: Embedded, static data values defined inline as JSON. sql: A SQL query that loads the data. |
| transform | string | | An array of transforms to perform on the input data. The output of the transform pipeline then becomes the value of this data set. Currently, can only be used with source data set types. |
| enableHitTesting | boolean | | If true, automatically adds rowid column(s) to the SQL statement, which is required for hit-testing using the get_result_row_for_pixel endpoint. |
Examples
Load discrete x- and y-column values using the values database table type:
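As a minimal sketch, a values data definition can be assembled as a Python dictionary and serialized to JSON. The data set name "table" and the sample points are made up for illustration; only the name and values properties come from the description above.

```python
import json

# Hypothetical sample points, chosen for illustration only.
points = [{"x": x, "y": x * x} for x in range(5)]

data = [
    {
        "name": "table",   # hypothetical name; must be unique within the spec
        "values": points,  # static rows embedded inline
    }
]

data_json = json.dumps({"data": data}, indent=2)
print(data_json)
```

The printed JSON is the "data" portion of a specification; marks and scales would refer to the data set by its name.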
Use the sql database table type to load latitude and longitude coordinates from the tweets_data database table:
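A sketch of such a definition, again built in Python and serialized to JSON. The column names lon and lat and the data set name "tweets" are assumptions; only the tweets_data table name and the sql property appear in the text above.

```python
import json

# Assumption: tweets_data exposes longitude and latitude as "lon" and "lat".
query = "SELECT lon AS x, lat AS y FROM tweets_data"

data = [
    {
        "name": "tweets",  # hypothetical data set name
        "sql": query,      # rows are loaded by running this query
    }
]

data_json = json.dumps({"data": data}, indent=2)
print(data_json)
```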
Use the source type to use the data set defined in the sql data section and perform aggregation transforms:
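The following sketch chains a source data set onto a sql data set and attaches an aggregate transform. The data set names, the column names, and the output names are hypothetical. The sketch also assumes that fields, ops, and as are paired by position.

```python
import json

data = [
    {
        # Base data set loaded with SQL (assumed column names lon and lat).
        "name": "tweets",
        "sql": "SELECT lon AS x, lat AS y FROM tweets_data",
    },
    {
        # Derived data set: reads from "tweets" and aggregates its columns.
        "name": "tweets_stats",
        "source": "tweets",
        "transform": [
            {
                "type": "aggregate",
                "fields": ["x", "x", "y", "y"],
                "ops": ["min", "max", "min", "max"],
                "as": ["minx", "maxx", "miny", "maxy"],
            }
        ],
    },
]

agg = data[1]["transform"][0]
# Assumed pairing: the i-th op is applied to the i-th field and named by the i-th "as" entry.
pairs = list(zip(agg["fields"], agg["ops"], agg["as"]))
data_json = json.dumps({"data": data}, indent=2)
print(data_json)
```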
Data Properties
name
The name property uniquely identifies a data set and is used for reference by other Vega properties, such as the Marks property.
format
The format property indicates that data preprocessing is needed before rendering the query result. If this property is not specified, data is assumed to be in row-oriented JSON format.
This property is required for Polys and Lines mark types. The property has one of two forms:
The "short form", where format is a single string, which must be either polys or lines. This form is used for all polygon rendering, and for fast "in-situ" rendering of LINESTRING data.
The "long form", where format is an object containing other properties, as follows:
| Format Property | Description |
| --- | --- |
| type | Marks property type: polys or lines. |
| coords | Applies to type: lines. Specifies x and y arrays, which must both be the same size. This allows the columns used for line rendering to be extracted and placed in a rendering buffer. The coords property also dictates the ordering of points in the line. Separate x- and y-array columns are also supported. |
| layout | (optional) Applies to type: lines. Specifies how vertices are packed in the vertices column. All arrays must have the same layout. interleaved (default): All elements corresponding to a single vertex are adjacent, ordered in pairs. For example, x0, y0, x1, y1, x2, y2. sequential: All elements of the same axis are adjacent. For example, x0, x1, x2, y0, y1, y2. |
For lines, each row in the query corresponds to a single line.
This lines format example of interleaved data renders ten lines, all of the same length.
In this lines format example of sequential data, x only stores points corresponding to the x coordinate and y only stores points corresponding to the y coordinate. Make sure that columns only contain a single coordinate if using multiple columns in sequential layout.
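To make the two layouts concrete, the sketch below unpacks a single packed coordinate array into (x, y) vertices for each layout. It only illustrates the ordering described above and is not the renderer's code; the function name unpack_vertices is hypothetical.

```python
from typing import List, Tuple


def unpack_vertices(values: List[float], layout: str = "interleaved") -> List[Tuple[float, float]]:
    """Turn one packed coordinate array into a list of (x, y) vertices."""
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    half = len(values) // 2
    if layout == "interleaved":
        # x0, y0, x1, y1, x2, y2
        return list(zip(values[0::2], values[1::2]))
    if layout == "sequential":
        # x0, x1, x2, y0, y1, y2
        return list(zip(values[:half], values[half:]))
    raise ValueError(f"unknown layout: {layout}")


# The same three-vertex line, packed both ways.
interleaved = [0, 0, 1, 10, 2, 20]
sequential = [0, 1, 2, 0, 10, 20]
line_a = unpack_vertices(interleaved, "interleaved")
line_b = unpack_vertices(sequential, "sequential")
print(line_a)
```

Both calls produce the same vertex list, which shows that the layout changes only how a row's coordinates are stored, not the line that is drawn.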
The following example shows a fast "in-situ" LINESTRING format:
The following example shows a polys format:
Data Source
The database table source property key-value pair specifies the location of the data and defines how the data is loaded:
| Key | Value | Description |
| --- | --- | --- |
| source | String | Data is loaded from an existing data set. |
| sql | SQL statement | Data is loaded using a SQL statement. |
| values | JSON data | Data is loaded from static, key-value pair data definitions. |
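The rules stated so far can be collected into a small checker. This is a sketch under assumptions, not a validator shipped with the product: the function name is hypothetical, and the "exactly one of source, sql, or values" rule is an inference from the table above rather than a documented constraint.

```python
SOURCE_KEYS = ("source", "sql", "values")


def check_data_definitions(data):
    """Return a list of problems found in a Vega "data" array (empty if none)."""
    problems = []
    seen = set()
    for i, definition in enumerate(data):
        name = definition.get("name")
        if not isinstance(name, str) or not name:
            problems.append(f"data[{i}]: missing required name")
        elif name in seen:
            problems.append(f"data[{i}]: duplicate name {name!r}")
        else:
            seen.add(name)
        # Assumption: each definition uses exactly one data source key.
        used = [key for key in SOURCE_KEYS if key in definition]
        if len(used) != 1:
            problems.append(f"data[{i}]: expected exactly one of {SOURCE_KEYS}, found {used}")
        # Documented: transforms can currently be used only with source data sets.
        if "transform" in definition and "source" not in definition:
            problems.append(f"data[{i}]: transform requires a source data set")
    return problems


good = [
    {"name": "a", "sql": "SELECT 1"},
    {"name": "b", "source": "a", "transform": []},
]
bad = [
    {"name": "a", "values": []},
    {"name": "a", "sql": "SELECT 1", "transform": []},
]
print(check_data_definitions(good))
print(check_data_definitions(bad))
```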
transform
Transforms process a data stream to calculate new aggregated statistic fields and derive new data streams from them. Currently, transforms are specified only as part of a source data definition. Transforms are defined as an array of specific transform types that are executed in sequential order. Each element of the array must be an object and must contain a type property. Currently, two transform types are supported: aggregate and formula.
aggregate: Performs aggregation operations on input data columns to calculate new aggregated statistic fields and derive new data streams from them. The following properties are required:
- fields: An array of strings referencing columns from the sourced data table.
- ops: An array of keyword strings and objects indicating the predefined operation to perform. For objects, the type property is required to name the type of the aggregation function. Supported operators:
  - count: The total count of data objects in the group.
  - countdistinct: The number of distinct values in an input data column; operates only on numeric or dictionary-encoded string columns.
  - distinct: An array of distinct values from an input data column; operates only on numeric or dictionary-encoded string columns.
  - max: The maximum field value.
  - mean / average / avg: The mean (average) field value.
  - median: The median of an input data column; operates only on numeric columns.
  - min: The minimum field value.
  - missing: The count of field values that are null or undefined.
  - quantile: An array of quantile separators; see https://en.wikipedia.org/wiki/Quantile. Operates only on numeric columns:
    - numQuantiles: The number of contiguous intervals to create; returns the separators for the intervals. The number of separators equals numQuantiles - 1. Range: 1-100. Default: 4
    - includeExtrema: Whether to include min and max values (extrema) in the resulting separator array. When true, the resulting array size is numQuantiles + 1. Values: true or false. Default: false
  - sum: The sum of field values.
  - stddev: The sample standard deviation of field values.
  - stddevp: The population standard deviation of field values.
  - valid: The count of field values that are neither null nor undefined.
  - variance: The sample variance of field values.
  - variancep: The population variance of field values.
- as: An array of strings used as output names of the operations for later reference.

formula: Evaluates a user-defined expression. The following properties are required:
- as: A string used as an output name for later reference.
Note: Currently, expressions can only be performed against outputs (as values) from prior aggregate transforms.
See Tutorial: Using Transforms for more detailed examples.
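To illustrate what some of the aggregate operators compute, the sketch below emulates them over a list of row dictionaries with Python's statistics module. It is an approximation for intuition only: the function names are hypothetical, null handling is assumed, and the interpolation method the renderer uses for quantiles is not documented here, so only the number of separators is meant to match.

```python
import statistics


def aggregate(rows, fields, ops, names):
    """Emulate a few aggregate operators; fields, ops, and names are paired by position."""
    funcs = {
        "count": lambda col, ok: len(col),               # all rows in the group
        "valid": lambda col, ok: len(ok),                # non-null values
        "missing": lambda col, ok: len(col) - len(ok),   # null values
        "min": lambda col, ok: min(ok),
        "max": lambda col, ok: max(ok),
        "sum": lambda col, ok: sum(ok),
        "mean": lambda col, ok: statistics.mean(ok),
        "median": lambda col, ok: statistics.median(ok),
        "stddev": lambda col, ok: statistics.stdev(ok),    # sample
        "stddevp": lambda col, ok: statistics.pstdev(ok),  # population
        "countdistinct": lambda col, ok: len(set(ok)),
    }
    result = {}
    for field, op, name in zip(fields, ops, names):
        column = [row.get(field) for row in rows]
        present = [v for v in column if v is not None]
        result[name] = funcs[op](column, present)
    return result


def quantile_separators(values, num_quantiles=4, include_extrema=False):
    """Return num_quantiles - 1 separators, or num_quantiles + 1 with extrema."""
    cuts = statistics.quantiles(values, n=num_quantiles)
    if include_extrema:
        cuts = [min(values)] + cuts + [max(values)]
    return cuts


rows = [{"x": 1}, {"x": 2}, {"x": None}, {"x": 6}]
stats = aggregate(
    rows,
    fields=["x", "x", "x", "x"],
    ops=["count", "valid", "missing", "mean"],
    names=["n", "n_valid", "n_missing", "mean_x"],
)
print(stats)
print(quantile_separators(list(range(1, 11)), 4, include_extrema=True))
```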
enableHitTesting
If true, automatically adds rowid column(s) to the SQL statement where appropriate, enabling the data block for hit-testing using the get_result_row_for_pixel endpoint.
If false, the data block is not automatically hit-test enabled, and any later get_result_row_for_pixel calls return empty hit-test results.
If the enableHitTesting property is not present, the following legacy behavior is used as the default:
If the SQL statement represents a projection query, hit-testing is enabled if a rowid column is explicitly projected.
If the SQL statement represents an aggregate query, hit-testing is always enabled.
This legacy behavior will likely be deprecated and removed in an upcoming version of OmniSci. At that point, the enableHitTesting property will be required for activating hit-test support for the data.
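The rules above, including the legacy default, can be summarized as a small decision function. This is a sketch of the documented behavior, not the server's implementation; the function and parameter names are hypothetical.

```python
def hit_testing_enabled(enable_hit_testing, is_aggregate_query, projects_rowid):
    """Decide whether a data block is hit-test enabled.

    enable_hit_testing is True, False, or None when the property is absent.
    """
    if enable_hit_testing is not None:
        # Explicit setting wins: true adds rowid column(s), false disables hit-testing.
        return bool(enable_hit_testing)
    # Legacy default when the property is not present.
    if is_aggregate_query:
        return True           # aggregate queries are always hit-test enabled
    return projects_rowid     # projection queries need an explicit rowid column


print(hit_testing_enabled(None, is_aggregate_query=False, projects_rowid=True))
```

Setting enableHitTesting explicitly avoids relying on the legacy branch, which the text above says is likely to be removed.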