HEAVY.AI Docs
v7.2.4
v7.2.4
  • Welcome to HEAVY.AI Documentation
  • Overview
    • Overview
    • Release Notes
  • Installation and Configuration
    • System Requirements
      • Hardware Reference
      • Software Requirements
    • Installation
      • Free Version
      • Installing on CentOS
        • HEAVY.AI Installation on CentOS/RHEL
        • Install NVIDIA Drivers and Vulkan on CentOS/RHEL
      • Installing on Ubuntu
        • HEAVY.AI Installation on Ubuntu
        • Install NVIDIA Drivers and Vulkan on Ubuntu
      • Installing on Docker
        • HEAVY.AI Installation using Docker on Ubuntu
      • Getting Started on AWS
      • Getting Started on GCP
      • Getting Started on Azure
      • Getting Started on Kubernetes (BETA)
      • Upgrading
        • Upgrading HEAVY.AI
        • Upgrading from Omnisci to HEAVY.AI 6.0
        • CUDA Compatibility Drivers
      • Uninstalling
      • Ports
    • Services and Utilities
      • Using Services
      • Using Utilities
    • Executor Resource Manager
    • Configuration Parameters
      • Overview
      • Configuration Parameters for HeavyDB
      • Configuration Parameters for HEAVY.AI Web Server
    • Security
      • Roles and Privileges
      • Connecting Using SAML
      • Implementing a Secure Binary Interface
      • Encrypted Credentials in Custom Applications
      • LDAP Integration
    • Distributed Configuration
  • Loading and Exporting Data
    • Supported Data Sources
      • Kafka
      • Using Heavy Immerse Data Manager
      • Importing Geospatial Data
    • Command Line
      • Loading Data with SQL
      • Exporting Data
  • SQL
    • Data Definition (DDL)
      • Datatypes
      • Users and Databases
      • Tables
      • System Tables
      • Views
      • Policies
    • Data Manipulation (DML)
      • SQL Capabilities
        • ALTER SESSION SET
        • ALTER SYSTEM CLEAR
        • DELETE
        • EXPLAIN
        • INSERT
        • KILL QUERY
        • LIKELY/UNLIKELY
        • SELECT
        • SHOW
        • UPDATE
        • Arrays
        • Logical Operators and Conditional and Subquery Expressions
        • Table Expression and Join Support
        • Type Casts
      • Geospatial Capabilities
        • Uber H3 Hexagonal Modeling
      • Functions and Operators
      • System Table Functions
        • generate_random_strings
        • generate_series
        • tf_compute_dwell_times
        • tf_feature_self_similarity
        • tf_feature_similarity
        • tf_geo_rasterize
        • tf_geo_rasterize_slope
        • tf_graph_shortest_path
        • tf_graph_shortest_paths_distances
        • tf_load_point_cloud
        • tf_mandelbrot*
        • tf_point_cloud_metadata
        • tf_raster_contour_lines; tf_raster_contour_polygons
        • tf_raster_graph_shortest_slope_weighted_path
        • tf_rf_prop_max_signal (Directional Antennas)
        • ts_rf_prop_max_signal (Isotropic Antennas)
        • tf_rf_prop
      • Window Functions
      • Reserved Words
      • SQL Extensions
  • Heavy Immerse
    • Introduction to Heavy Immerse
    • Admin Portal
    • Control Panel
    • Working with Dashboards
      • Dashboard List
      • Creating a Dashboard
      • Configuring a Dashboard
      • Duplicating and Sharing Dashboards
    • Measures and Dimensions
    • Using Parameters
    • Using Filters
    • Using Cross-link
    • Chart Animation
    • Multilayer Charts
    • SQL Editor
    • Customization
    • Joins (Beta)
    • Chart Types
      • Overview
      • Bar
      • Bubble
      • Choropleth
      • Combo
      • Cross-Section
      • Contour
      • Gauge
      • Geo Heatmap
      • Heatmap
      • Histogram
      • Line
      • Linemap
      • New Combo
      • Number
      • Pie
      • Pointmap
      • Scatter Plot
      • Skew-T
      • Stacked Bar
      • Table
      • Text Widget
      • Wind Barb
  • HeavyRF
    • Introduction to HeavyRF
    • Getting Started
    • HeavyRF Table Functions
  • HeavyConnect
    • HeavyConnect Release Overview
    • Getting Started
    • Best Practices
    • Examples
    • Command Reference
    • Parquet Data Wrapper Reference
    • ODBC Data Wrapper Reference
  • HeavyML (BETA)
    • HeavyML Overview
    • Clustering Algorithms
    • Regression Algorithms
      • Linear Regression
      • Random Forest Regression
      • Decision Tree Regression
      • Gradient Boosting Tree Regression
    • Principal Components Analysis
  • Python / Data Science
    • Data Science Foundation
    • JupyterLab Installation and Configuration
    • Using HEAVY.AI with JupyterLab
    • Python User-Defined Functions (UDFs) with the Remote Backend Compiler (RBC)
      • Installation
      • Registering and Using a Function
      • User-Defined Table Functions
      • RBC UDF/UDTF Example Notebooks
      • General UDF/UDTF Tutorial Notebooks
      • RBC API Reference
    • Ibis
    • Interactive Data Exploration with Altair
    • Additional Examples
      • Forecasting with HEAVY.AI and Prophet
  • APIs and Interfaces
    • Overview
    • heavysql
    • Thrift
    • JDBC
    • ODBC
    • Vega
      • Vega Tutorials
        • Vega at a Glance
        • Getting Started with Vega
        • Getting More from Your Data
        • Creating More Advanced Charts
        • Using Polys Marks Type
        • Vega Accumulator
        • Using Transform Aggregation
        • Improving Rendering with SQL Extensions
      • Vega Reference Overview
        • data Property
        • projections Property
        • scales Property
        • marks Property
      • Migration
        • Migrating Vega Code to Dynamic Poly Rendering
      • Try Vega
    • RJDBC
    • SQuirreL SQL
    • heavyai-connector
  • Tutorials and Demos
    • Loading Data
    • Using Heavy Immerse
    • Hello World
    • Creating a Kafka Streaming Application
    • Getting Started with Open Source
    • Try Vega
  • Troubleshooting and Special Topics
    • FAQs
    • Troubleshooting
    • Vulkan Renderer
    • Optimizing
    • Known Issues and Limitations
    • Logs and Monitoring
    • Archived Release Notes
      • Release 6.x
      • Release 5.x
      • Release 4.x
      • Release 3.x
Powered by GitBook
On this page
  • Creating a Topic
  • Using the Producer
  • Using the Consumer
Export as PDF
  1. Loading and Exporting Data
  2. Supported Data Sources

Kafka

Apache Kafka is a distributed streaming platform. It allows you to create publishers, which create data streams, and consumers, which subscribe to and ingest the data streams produced by publishers.

You can use HeavyDB KafkaImporter C++ program to consume a topic created by running Kafka shell scripts from the command line. Follow the procedure below to use a Kafka producer to send data, and a Kafka consumer to store the data, in HeavyDB.

This example assumes you have already installed and configured Apache Kafka. See the Kafka website.

Creating a Topic

Create a sample topic for your Kafka producer.

  1. Run the kafka-topics.sh script with the following arguments:

    bin/kafka-topics.sh --create --zookeeper localhost:2181 --replication-factor 1
    --partitions 1 --topic matstream
  2. Create a file named myfile that consists of comma-separated data. For example:

    michael,1
    andrew,2
    ralph,3
    sandhya,4
  3. Use heavysql to create a table to store the stream.

    create table stream1(name text, id int);

Using the Producer

Load your file into the Kafka producer.

  1. Create and start a producer using the following command.

    cat myfile | bin/kafka-console-producer.sh --broker-list localhost:9097
    --topic matstream

Using the Consumer

Load the data to HeavyDB using the Kafka console consumer and the KafkaImporter program.

  1. Pull the data from Kafka into the KafkaImporter program.

    /home/heavyai/build/bin/KafkaImporter stream1 heavyai -p HyperInteractive -u heavyai --port 6274 --batch 1 --brokers localhost:6283  
    --topic matstream --group-id 1
    
    Field Delimiter: ,
    Line Delimiter: \n
    Null String: \N
    Insert Batch Size: 1
    1 Rows Inserted, 0 rows skipped.
    2 Rows Inserted, 0 rows skipped.
    3 Rows Inserted, 0 rows skipped.
    4 Rows Inserted, 0 rows skipped.
  2. Verify that the data arrived using heavysql.

    heavysql> select * from stream1;
    name|id
    michael|1
    andrew|2
    ralph|3
    sandhya|4
PreviousSupported Data SourcesNextUsing Heavy Immerse Data Manager

Last updated 3 years ago