Measures and Dimensions
Charts comprise measures and dimensions. Measures are values or aggregated values. Dimensions categorize measures. The measure and dimension requirements for each chart type are discussed in Heavy Immerse Chart Types.
Measures
Typically, measures are calculated fields such as SUM, AVERAGE, and so on. Immerse creates charts by plotting these calculated values. This lets you visually compare measures of one dimension with another.
If you select a numerical column as a measure, you can aggregate that column as AVERAGE, MIN, MAX, SUM, #UNIQUE, STDDEV, or SAMPLE. If you select a string column (that is, a column of text values), Immerse automatically aggregates by #UNIQUE, which returns the count of distinct strings in that column.
For optimal performance across very large data sets, #UNIQUE uses APPROX_COUNT_DISTINCT. COUNT(DISTINCT x), especially when used with GROUP BY, can require a very large amount of memory to hold all distinct values in large tables with large cardinalities. You can create a custom SQL calculation using COUNT(DISTINCT x), but performance will be negatively impacted. For more information, see statistical and aggregate function Usage Notes.
SAMPLE is different from other aggregate choices. Ideally, you use sample when there is one and only one value in the measure column that corresponds with the selected dimension. In that scenario, sample greatly improves the performance of your query. If there are multiple measure values, the results are likely to vary each time the chart redraws.
Depending on the chart, Immerse can visualize anywhere from one to four measures. A Pie chart, for example, offers 2 measures, Size and Color. Whichever column you choose as the size measure is used to determine the size of the slices in the pie chart. Color is an optional second measure for Pie Chart, allowing another layer of information to be visualized on the same chart. For example, you can size the Pie’s slices by the number of sales opportunities that are closed, and color them by the average income from those sales.
A more complex chart, such as a scatter plot, can concurrently visualize up to four measures. Scatter plots show values on a two-dimensional matrix, based on an X measure and a Y measure. Additionally, the dots used in the scatterplot can be sized and colored by two additional measures, for a total of four measures.
Special Behavior with Table Charts, for Measures
Table charts are a nongraphical, row/column presentation of raw data. Measures can behave differently than with other chart types. If you create a table chart with measures but no dimensions, the data is not grouped; the chart presents raw information at the row level from the database. If you create a table chart with dimensions, the measures act in their usual way, as an aggregate calculation. If you want to view raw, row-level information from the database without performing any transformations or calculations, create a table chart with measures but no dimensions.
Custom Measures
In addition to normal measures, which perform simple aggregation calculations on data, you can create custom measures that perform aggregations for any HEAVY.AI-supported SQL. For information about HEAVY.AI SQL support, see HEAVY.AI SQL Capabilities.
The custom measure can be any aggregate statement that can be substituted for the term COUNT(column1)
in the following SQL statement.
To create a Custom Measure:
Create or edit a chart.
From the Add Measure pop-up, choose Custom SQL Measure.
Enter an Alias for the custom measure.
Enter the SQL aggregation clause in the Custom Measure field.
Click OK.
For example, if you have a table with Price, Quantity, and Tax columns, you can compute the Subtotal using the custom SQL clause (Quantity * Price) + (Quantity * Price * Tax)
.
You can check to see which products bring in the most average gross sales by finding the average of the Price
.
Because most Immerse charts require dimensions, which group the data, custom measures for those charts must always be written as aggregate statements, as in the examples above. However, for charts that do not require dimensions, such as a point map or table, you can write custom measures that are not aggregate statements.
Dimensions
Dimensions are the grouped columns in a query. For example, if a table lists every car sold in the United States, and you want to display the number of cars sold per manufacturer, the dimension is car manufacturer (that is, the query results are grouped by manufacturer).
A query can have more than one dimension. For example, the number of cars sold by car manufacturer, by state. When a chart has multiple dimensions, the dimensions are separated by forward slashes (/). For example, the car manufacturer by state would display “Chevrolet / California.” Table Charts display multiple dimensions in separate table columns. You can change the order of dimensions by dragging and dropping the dimensions.
Certain chart types have restrictions on the types of dimensions you can use. For example, histograms and line charts can only have numerical dimensions. Choropleths can only have dimensions that reflect geographic regions. Pointmaps have no dimensions, since they display geographic points at the latitude/longitude level only. Number charts have no dimensions, since their purpose is to present only a single number.
Binned Dimensions
When a dimension is set to a numerical column, Immerse presents information grouped by each number. If there are many numbers, Immerse automatically creates binned ranges of numbers. For example, if you have a table with many distinct numbers from 1 to 1 million, a display of 10 bins would be 1 - 100,000, 100,001 - 200,000, and so on.
When you display TIMESTAMP dimensions, you can bin on milliseconds if, when binning by millisecond, there would be fewer than 1000 bins. Otherwise, binning defaults to the smallest unit of time that provides fewer than 1000 bins.
You can manually disable automatic binning to force the display of all numbers in the dimension. Depending on the expense of the query, Immerse might display a notification that the unbinned query is not available.
Null Dimensions
Many times, measures have no entry for a dimension. A null value might or might not be significant. For the measure Baggage_Fee, a null value might be considered “no fee collected,” and could be useful information. For the measure Departure_City, a null value is spurious data, because an airline flight has to take off from somewhere. Spurious data can skew the accuracy of your chart.
You can choose to omit null dimensions from Table, Bar, Bubble, and Heatmap charts to visualize only existing data values.
Custom Dimensions
You can create custom dimensions that group query results by using HEAVY.AI supported SQL. Consider the following standard SQL query. You can create a custom dimension using any statement that could replace the BY
clause in this SQL query:
To enter a Custom Dimension:
Create or edit a chart.
From the Add Dimension pop-up, choose Custom SQL Dimension.
Enter an Alias for your custom dimension.
Enter the contents of a
BY
clause from a SQL query into the Custom Dimension field.Click OK.
For example, looking at a dataset of voters from the 2016 election, you might want to do some demographic analysis by gender. You can build your BY
clause by adding the Gender column name. This is no different from setting the dimension to Gender, but it has the bonus side effect of changing the title of the column in your chart.
You might want to express the results as how many voters identified themselves as female. The custom SQL clause is 'f'
. The Alias is Women Voters. The information is presented as true if the gender is female, or false if the gender is not.
You can create a new dimension combining information from two or more columns. For example, you can use a case
statement to define a custom dimension that groups by both gender and party affiliation.
You can also use the case
statement to create a funnel that aggregates data at an increasing level of specificity. For example, this SQL clause creates a chart that shows the number of UFO sightings based on an accumulation of common factors.
Filter On Aggregate
Pointmap charts allow you to place an additional filter on the values grouped by dimension. When you add a Dimension to your chart, the Filter On Aggregate field displays. Choose a field on which to filter your data (or create a custom dimension), then add the filter criteria. Only the records that meet your criteria are plotted on the chart.
For example, this Pointmap shows the origin points of flights that experienced a weather delay.
If you are not concerned with trivial delays, you can filter the aggregated results to show only the delays greater than 30 minutes.
Customizing Measure and Date Formats
You can customize the format of numbers and dates in charts, making it easier to read and understand the values in charts and on axis labels, tooltips, and other chart elements. For example, you can format measures and date dimensions to:
Indicate a particular currency or time format
Denote numerical precision
Display the meaningful part of a date–for example, just the year
You can also add prefixes and suffixes to further define the formats.
To customize measure and date dimension formats on your chart, follow these steps:
1) Create or edit a chart.
2) For every numerical measure or date dimension selected for the chart, a formatting input box is added to the right-hand configuration panel. For example, the chart below shows one numerical measure (SUM Sales) and one date dimension (AUTO Order_Date):
3) In the Formatting area, click the Dimension Date Formatting or the Measure Number Formatting input box for the measure or dimension format that you want to customize.
4) The drop-down list for the field shows the most Common Number Formats and Common Date Formats. Select an available format in the drop-down list, or enter your own, following the d3-format syntax for numbers or syntax for dates.
In addition, the following extensions are available for number formats:
prefix {[format]} suffix
- Prepend or append the prefix and suffix to the formatted value, using the format defined inside the curly braces ({ }).[format]|k'
- Format using International System of Units (SI) prefix.
Common Number Formats
Description | Format String | Example |
Plain English suffix | B | 12B |
Two decimals, thousands separator | ,.2f | 1,234.57 |
Zero decimals, thousands separator | ,.0f | 1,235 |
SI suffix | .2s | .2s |
Dollar currency | -$.2f | -$1234.57 |
Dollar currency, negative values in parentheses | ($.2f | ($1234.57) |
SI suffix, forcing to a specific suffix | ,.2s|k | 1,234.57k |
Percentage, multiplying by 100 (for example, to convert 0.15 to 15%) | ,.2% | 123,457.00% |
Add percent suffix | {,.2f}% | 1,234.57% |
Add a suffix | {,.2f}km/h | 1,234.57km/h |
Common Date Formats
Description | Format String | Example |
year-month-day | %Y-%m-%d | 2018-04-12 |
month/day/year | %m/%d/%Y | 04/12/2018 |
Hours:Minutes:Seconds | %H:%M:%S | 15:46:25 |
Hours:Minutes:Seconds.Milliseconds | %H:%M:%S.%L | 15:46:25.740 |
Month day, year | %B %d, %Y | April 12, 2018 |
Year | %y | 18 |
Month | %B | April |
Day of the Week | %A | Thursday |
Date, Time | %c | 4/12/2018, 3:48:44 PM |
Milliseconds | %Lms | 740ms |
Hours:Minutes:Seconds:Milliseconds | %H:%M:%S:%L | 15:46:25:740 |
Examples
Line Chart Date and Numbers
The following Line chart shows sales by order date:
To make the chart easier to read, apply the format %m/%d/%y to dimension date formatting and SI to measure number formatting:
Bar Chart Percentage
The following Bar chart shows profit ratio percentage to six decimal places, with % applied for Custom Profit Ratio:
To simplify the percentage format to one decimal place, apply ,.1% to the Custom Profit Ratio:
Bar Chart Currency
In the following Bar chart, the average profit by state is expressed in whole dollars, both positive and negative, by using the ($0.f format in the Avg Profit input box:
Last updated