# Tables

These functions are used to create and modify data tables in HEAVY.AI.

## Nomenclature Constraints

Table names must use the NAME format, described in [regex](https://en.wikipedia.org/wiki/Regular_expression) notation as:

```
[A-Za-z_][A-Za-z0-9\$_]*
```

Table and column names can include quotes, spaces, and the underscore character. Other special characters are permitted if the name of the table or column is enclosed in double quotes (" ").

{% hint style="warning" %}

* Spaces and special characters other than underscore (\_) cannot be used in Heavy Immerse.
* Column and table names enclosed in double quotes cannot be used in Heavy Immerse
  {% endhint %}

## CREATE TABLE

```sql
CREATE [TEMPORARY] TABLE [IF NOT EXISTS] <table>
  (<column> <type> [NOT NULL] [DEFAULT <value>] [ENCODING <encodingSpec>],
  [SHARD KEY (<column>)],
  [SHARED DICTIONARY (<column>) REFERENCES <table>(<column>)], ...)
  [WITH (<property> = value, ...)];
```

Create a table named `<table>` specifying `<columns>` and table properties.

### Supported Datatypes

| Datatype                  | Size (bytes) | Notes                                                                                                                                                                                                                                                                                                                                                                                                          |
| ------------------------- | ------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `BIGINT`                  | 8            | Minimum value: `-9,223,372,036,854,775,807`; maximum value: `9,223,372,036,854,775,807`.                                                                                                                                                                                                                                                                                                                       |
| `BOOLEAN`                 | 1            | TRUE: `'true'`, `'1'`, `'t'`. FALSE: `'false'`, `'0'`, `'f'`. Text values are not case-sensitive.                                                                                                                                                                                                                                                                                                              |
| `DATE`**\***              | 4            | Same as `DATE ENCODING DAYS(32)`.                                                                                                                                                                                                                                                                                                                                                                              |
| `DATE ENCODING DAYS(32)`  | 4            | Range in years: `+/-5,883,517` around epoch. Maximum date January 1, 5885487 (approximately). Minimum value: `-2,147,483,648`; maximum value: `2,147,483,647`. Supported formats when using `COPY FROM`: `mm/dd/yyyy`, `dd-mmm-yy`, `yyyy-mm-dd`, `dd/mmm/yyyy`.                                                                                                                                               |
| `DATE ENCODING DAYS(16)`  | 2            | <p>Range in days: <code>-32,768</code> - <code>32,767</code><br>Range in years: <code>+/-90</code> around epoch, April 14, 1880 - September 9, 2059.<br>Minumum value: <code>-2,831,155,200</code>; maximum value: <code>2,831,068,800</code>.<br>Supported formats when using <code>COPY FROM</code>: <code>mm/dd/yyyy</code>, <code>dd-mmm-yy</code>, <code>yyyy-mm-dd</code>, <code>dd/mmm/yyyy</code>.</p> |
| `DATE ENCODING FIXED(32)` | 4            | In DDL statements defaults to `DATE ENCODING DAYS(16)`. Deprecated.                                                                                                                                                                                                                                                                                                                                            |
| `DATE ENCODING FIXED(16)` | 2            | In DDL statements defaults to `DATE ENCODING DAYS(16)`. Deprecated.                                                                                                                                                                                                                                                                                                                                            |
| `DECIMAL`                 | 2, 4, or 8   | <p>Takes precision and scale parameters: <code>DECIMAL(precision,scale)</code>.</p><p>Size depends on precision:</p><ul><li>Up to <code>4</code>: 2 bytes</li><li><code>5</code> to <code>9</code>: 4 bytes</li><li><code>10</code> to <code>18</code> (maximum): 8 bytes</li></ul><p>Scale must be less than precision.</p>                                                                                   |
| `DOUBLE`                  | 8            | Variable precision. Minimum value: `-1.79 x e^308`; maximum value: `1.79 x e^308`.                                                                                                                                                                                                                                                                                                                             |
| `FLOAT`                   | 4            | Variable precision. Minimum value: `-3.4 x e^38`; maximum value: `3.4 x e^38`.                                                                                                                                                                                                                                                                                                                                 |
| `INTEGER`                 | 4            | Minimum value: `-2,147,483,647`; maximum value: `2,147,483,647`.                                                                                                                                                                                                                                                                                                                                               |
| `SMALLINT`                | 2            | Minimum value: `-32,767`; maximum value: `32,767`.                                                                                                                                                                                                                                                                                                                                                             |
| `TEXT ENCODING DICT`      | 4            | Max cardinality 2 billion distinct string values                                                                                                                                                                                                                                                                                                                                                               |
| `TEXT ENCODING NONE`      | Variable     | Size of the string + 6 bytes                                                                                                                                                                                                                                                                                                                                                                                   |
| `TIME`                    | 8            | Minimum value: `00:00:00`; maximum value: `23:59:59`.                                                                                                                                                                                                                                                                                                                                                          |
| `TIMESTAMP`               | 8            | <p>Linux timestamp from <code>-30610224000</code> (<code>1/1/1000 00:00:00.000</code>) through <code>29379542399</code> (<code>12/31/2900 23:59:59.999</code>).</p><p>Can also be inserted and stored in human-readable format:</p><ul><li><code>YYYY-MM-DD HH:MM:SS</code></li><li><code>YYYY-MM-DDTHH:MM:SS</code> (The <code>T</code> is dropped when the field is populated.)</li></ul>                    |
| `TINYINT`                 | 1            | Minimum value: `-127`; maximum value: `127`.                                                                                                                                                                                                                                                                                                                                                                   |

**\*** In OmniSci release 4.4.0 and higher, you can use existing 8-byte `DATE` columns, but you can create only 4-byte `DATE` columns (default) and 2-byte `DATE` columns (see [`DATE ENCODING FIXED(16)`](https://docs.omnisci.com/latest/5_datatypes.html#date_encoding_16)).

For more information, see [Datatypes and Fixed Encoding](/v8.3.0/sql/data-definition-ddl/datatypes-and-fixed-encoding.md).

For geospatial datatypes, see [Geospatial Primitives](/v8.3.0/sql/data-definition-ddl/datatypes-and-fixed-encoding.md#geospatial-datatypes).

### **Examples**

Create a table named `tweets` and specify the columns, including type, in the table.

```sql
CREATE TABLE IF NOT EXISTS tweets (
   tweet_id BIGINT NOT NULL,
   tweet_time TIMESTAMP NOT NULL ENCODING FIXED(32),
   lat FLOAT,
   lon FLOAT,
   sender_id BIGINT NOT NULL,
   sender_name TEXT NOT NULL ENCODING DICT,
   location TEXT ENCODING  DICT,
   source TEXT ENCODING DICT,
   reply_to_user_id BIGINT,
   reply_to_tweet_id BIGINT,
   lang TEXT ENCODING  DICT,
   followers INT,
   followees INT,
   tweet_count INT,
   join_time TIMESTAMP ENCODING  FIXED(32),
   tweet_text TEXT,
   state TEXT ENCODING  DICT,
   county TEXT ENCODING DICT,
   place_name TEXT,
   state_abbr TEXT ENCODING DICT,
   county_state TEXT ENCODING DICT,
   origin TEXT ENCODING DICT,
   phone_numbers bigint);
```

Create a table named delta and assign a default value `San Francisco` to column city.

```sql
CREATE TABLE delta (
   id INTEGER NOT NULL, 
   name TEXT NOT NULL, 
   city TEXT NOT NULL DEFAULT 'San Francisco' ENCODING DICT(16));
```

{% hint style="info" %}
Default values currently have the following limitations:

* Only literals can be used for column DEFAULT values; expressions are not supported.
* You cannot define a DEFAULT value for a shard key. For example, the following does not parse: `CREATE TABLE tbl (id INTEGER NOT NULL DEFAULT 0, name TEXT, shard key (id)) with (shard_count = 2);`
* For arrays, use the following syntax: `ARRAY[A, B, C, …. N]`

  The syntax `{A, B, C, ... N}` is not supported.
* Some literals, like NUMERIC and GEO types, are not checked at parse time. As a result, you can define and create a table with malformed literal as a default value, but when you try to insert a row with a default value, it will throw an error.
  {% endhint %}

### Supported Encoding

| Encoding       | Descriptions                                                                                                                                                |
| -------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `DICT`         | Dictionary encoding on string columns (default for `TEXT` columns). Limit of 2 billion unique string values.                                                |
| `FIXED` (bits) | Fixed length encoding of integer or timestamp columns. See [Datatypes and Fixed Encoding](/v8.3.0/sql/data-definition-ddl/datatypes-and-fixed-encoding.md). |
| `NONE`         | No encoding. Valid only on `TEXT` columns. No Dictionary is created. Aggregate operations are not possible on this column type.                             |

### WITH Clause Properties

| Property              | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| --------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `fragment_size`       | Number of rows per fragment that is a unit of the table for query processing. Default: 32 million rows, which is not expected to be changed.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| `max_rollback_epochs` | <p>Limit the number of epochs a table can be rolled back to. Limiting the number of epochs helps to limit the amount of on-disk data and prevent unmanaged data growth.</p><p>Limiting the number of rollback epochs also can increase system startup speed, especially for systems on which data is added in small batches or singleton inserts. Default: 3.</p><p>The following example creates the table <code>test\_table</code> and sets the maximum epoch rollback number to 50<strong>:</strong></p><p><code>CREATE TABLE test\_table(a int) WITH (MAX\_ROLLBACK\_EPOCHS = 50);</code></p>                                                                                                                                                                                                  |
| `max_rows`            | <p>Used primarily for streaming datasets to limit the number of rows in a table, to avoid running out of memory or impeding performance. When the <code>max\_rows</code> limit is reached, the oldest fragment is removed. When populating a table from a file, make sure that your row count is below the <code>max\_rows</code> setting. If you attempt load more rows at one time than the <code>max\_rows</code> setting defines, the records up to the <code>max\_rows</code> limit are removed, leaving only the additional rows. Default: 2^62.<br><br>In a distributed system, the maximum number of rows is calculated as <code>max\_rows \* leaf\_count</code>. In a sharded distributed system, the maximum number of rows is calculated as <code>max\_rows \* shard\_count</code>.</p> |
| `page_size`           | Number of I/O page bytes. Default: 1MB, which does not need to be changed.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| `partitions`          | <p>Partition strategy option:</p><ul><li><code>SHARDED</code>: Partition table using sharding.</li><li><code>REPLICATED</code>: Partition table using replication.</li></ul>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| `shard_count`         | Number of shards to create, typically equal to the number of GPUs across which the data table is distributed.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      |
| `sort_column`         | Name of the column on which to sort during bulk import.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |

### Sharding

Sharding partitions a database table across multiple servers so each server has a part of the table with the same columns but with different rows. Partitioning is based on a sharding key defined when you create the table.

Without sharding, the dimension tables involved in a join are replicated and sent to each GPU, which is not feasible for dimension tables with many rows. Specifying a shard key makes it possible for the query to execute efficiently on large dimension tables.

Currently, specifying a shard key is useful for joins, only:

* If two tables specify a shard key with the same type and the same number of shards, a join on that key only sends a part of the dimension table column data to each GPU.
* For multi-node installs, the dimension table does not need to be replicated and the join executes locally on each leaf.

#### **Constraints**

* A shard key must specify a single column to shard on. There is no support for sharding by a combination of keys.
* One shard key can be specified for a table.
* Data are partitioned according to the shard key and the number of shards (`shard_count`).
* A value in the column specified as a shard key is always sent to the same partition.
* The number of shards should be equal to the number of GPUs in the cluster.
* Sharding is allowed on the following column types:
  * DATE
  * INT
  * TEXT ENCODING DICT
  * TIME
  * TIMESTAMP
* Tables must share the dictionary for the column to be involved in sharded joins. If the dictionary is not specified as shared, the join does not take advantage of sharding. Dictionaries are reference-counted and only dropped when the last reference drops.

#### **Recommendations**

* Set `shard_count` to the number of GPUs you eventually want to distribute the data table across.
* Referenced tables must also be `shard_count` -aligned.
* Sharding should be minimized because it can introduce load skew accross resources, compared to when sharding is not used.

**Examples**

Basic sharding:

```sql
CREATE TABLE  customers(
   accountId text,
   name text,
   SHARD KEY (accountId))
  WITH (shard_count = 4);
```

Sharding with shared dictionary:

```sql
CREATE TABLE transactions(
   accountId text,
   action text,
   SHARD KEY (accountId),
   SHARED DICTIONARY (accountId) REFERENCES customers(accountId))
  WITH (shard_count = 4);
```

### Temporary Tables

Using the TEMPORARY argument creates a table that persists only while the server is live. They are useful for storing intermediate result sets that you access more than once.

{% hint style="info" %}
Adding or dropping a column from a temporary table is not supported.
{% endhint %}

#### **Example**

```sql
CREATE TEMPORARY TABLE customers(
   accountId TEXT,
   name TEXT,
   timeCreated TIMESTAMP)
```

## CREATE TABLE AS SELECT

```sql
CREATE TABLE [IF NOT EXISTS] <newTableName> AS (<SELECT statement>) [WITH (<property> = value, ...)];
```

Create a table with the specified columns, copying any data that meet SELECT statement criteria.

### WITH Clause Properties

| Property                  | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
| ------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `fragment_size`           | Number of rows per fragment that is a unit of the table for query processing. Default = 32 million rows, which is not expected to be changed.                                                                                                                                                                                                                                                                                                                                               |
| `max_chunk_size`          | Size of chunk that is a unit of the table for query processing. Default: 1073741824 bytes (1 GB), which is not expected to be changed.                                                                                                                                                                                                                                                                                                                                                      |
| `max_rows`                | Used primarily for streaming datasets to limit the number of rows in a table. When the `max_rows` limit is reached, the oldest fragment is removed. When populating a table from a file, make sure that your row count is below the `max_rows` setting. If you attempt load more rows at one time than the `max_rows` setting defines, the records up to the `max_rows` limit are removed, leaving only the additional rows. Default = 2^62.                                                |
| `page_size`               | Number of I/O page bytes. Default = 1MB, which does not need to be changed.                                                                                                                                                                                                                                                                                                                                                                                                                 |
| `partitions`              | <p>Partition strategy option:</p><ul><li><code>SHARDED</code>: Partition table using sharding.</li><li><code>REPLICATED</code>: Partition table using replication.</li></ul>                                                                                                                                                                                                                                                                                                                |
| `use_shared_dictionaries` | <p>Controls whether the created table creates its own dictionaries for text columns, or instead shares the dictionaries of its source table. Uses shared dictionaries by default (<code>true</code>), which increases the speed of table creation.</p><p>Setting to false shrinks the dictionaries if SELECT for the created table has a narrow filter; for example:<br><br><code>CREATE TABLE new\_table AS SELECT \* FROM old\_table WITH (USE\_SHARED\_DICTIONARIES='false');</code></p> |
| `vacuum`                  | Formats the table to more efficiently handle `DELETE` requests. The only parameter available is `delayed`. Rather than immediately remove deleted rows, vacuum marks items to be deleted, and they are removed at an optimal time.                                                                                                                                                                                                                                                          |

### Examples

Create the table `newTable`. Populate the table with all information from the table `oldTable`, effectively creating a duplicate of the original table.

```sql
CREATE TABLE newTable AS (SELECT * FROM oldTable);
```

Create a table named `trousers`. Populate it with data from the columns `name`, `waist`, and `inseam` from the table `wardrobe`.

```sql
CREATE TABLE trousers AS (SELECT name, waist, inseam FROM wardrobe);
```

Create a table named `cosmos`. Populate it with data from the columns `star` and `planet` from the table universe where planet has the class M.

```sql
CREATE TABLE IF NOT EXISTS cosmos AS (SELECT star, planet FROM universe WHERE class='M');
```

## ALTER TABLE

```
ALTER TABLE <table> RENAME TO <table>;
ALTER TABLE <table> RENAME COLUMN <column> TO <column>;
ALTER TABLE <table> ADD [COLUMN] <column> <type> [NOT NULL] [ENCODING <encodingSpec>];
ALTER TABLE <table> ADD (<column> <type> [NOT NULL] [ENCODING <encodingSpec>], ...);
ALTER TABLE <table> ADD (<column> <type> DEFAULT <value>);
ALTER TABLE <table> DROP COLUMN <column_1>[, <column_2>, ...];
ALTER TABLE <table> SET MAX_ROLLBACK_EPOCHS=<value>;
ALTER TABLE <table> ALTER COLUMN <column> TYPE <type>, ALTER COLUMN <column> TYPE <type>, ...;
```

### Examples

Rename the table tweets to retweets.

```sql
ALTER TABLE tweets RENAME TO retweets;
```

Rename the column source to device in the table retweets.

```sql
ALTER TABLE retweets RENAME COLUMN source TO device;
```

Add the column pt\_dropoff to table tweets with a default value point(0,0).

```sql
ALTER TABLE tweets ADD COLUMN pt_dropoff POINT DEFAULT 'point(0 0)';
```

Add multiple columns a, b, and c to table table\_one with a default value of `15` for column b.

```sql
ALTER TABLE table_one ADD a INTEGER, b INTEGER NOT NULL DEFAULT 15, c TEXT;
```

{% hint style="info" %}
Default values currently have the following limitations:

* Only literals can be used for column DEFAULT values; expressions are not supported.
* For arrays, use the following syntax: `ARRAY[A, B, C, …. N]`. The syntax `{A, B, C, ... N}` is not supported.
* Some literals, like NUMERIC and GEO types, are not checked at parse time. As a result, you can define and create a table with a malformed literal as a default value, but when you try to insert a row with a default value, it throws an error.
  {% endhint %}

Add the column lang to the table tweets using a TEXT ENCODING DICTIONARY.

```sql
ALTER TABLE tweets ADD COLUMN lang TEXT ENCODING DICT;
```

Add the columns lang and encode to the table tweets using a TEXT ENCODING DICTIONARY for each.

```sql
ALTER TABLE tweets ADD (lang TEXT ENCODING DICT, encode TEXT ENCODING DICT);
```

Drop the column pt\_dropoff from table tweets.

```sql
ALTER TABLE tweets DROP COLUMN pt_dropoff;
```

Limit on-disk data growth by setting the number of allowed epoch rollbacks to 50:

```sql
ALTER TABLE test_table SET MAX_ROLLBACK_EPOCHS=50;
```

{% hint style="info" %}

* You cannot add a dictionary-encoded string column with a shared dictionary when using ALTER TABLE ADD COLUMN.
* Currently, HEAVY.AI does not support adding a geo column type (POINT, LINESTRING, POLYGON, or MULTIPOLYGON) to a table.
* HEAVY.AI supports ALTER TABLE RENAME TABLE and ALTER TABLE RENAME COLUMN for temporary tables. HEAVY.AI does not support ALTER TABLE ADD COLUMN to modify a temporary table.
  {% endhint %}

Change a text column “id” to an integer column:

```sql
ALTER TABLE my_table ALTER COLUMN id TYPE INTEGER;
```

Change text columns “id” and “location” to big integer and point columns respectively:

```sql
ALTER TABLE my_table ALTER COLUMN id TYPE BIGINT, ALTER COLUMN location TYPE GEOMETRY(POINT, 4326);
```

{% hint style="info" %}
Currently, only text column types (dictionary encoded and none encoded text columns) can be altered.
{% endhint %}

## DROP TABLE

```
DROP TABLE [IF EXISTS] <table>;
```

Deletes the table structure, all data from the table, and any dictionary content unless it is a shared dictionary. (See the Note regarding [disk space reclamation](https://docs.omnisci.com/latest/5_tables.html#delete-admonition).)

### Example

```sql
DROP TABLE IF EXISTS tweets;
```

## DUMP TABLE

```
DUMP TABLE <table> TO '<filepath>' [WITH (COMPRESSION='<compression_program>')];
```

Archives data and dictionary files of the table `<table>` to file `<filepath>`.

Valid values for `<compression_program>` include:

* gzip (default)
* pigz
* lz4
* none

If you do not choose a compression option, the system uses gzip if it is available. If gzip is not installed, the file is not compressed.

The file path must be enclosed in single quotes.

{% hint style="info" %}

* Dumping a table locks writes to that table. Concurrent reads are supported, but you cannot import to a table that is being dumped.
* The `DUMP` command is not supported on distributed configurations.
* You must have a least GRANT CREATE ON DATABASE privilege level to use the `DUMP` command.
  {% endhint %}

### Example

```sql
DUMP TABLE tweets TO '/opt/archive/tweetsBackup.gz' WITH (COMPRESSION='gzip');
```

## RENAME TABLE

```
RENAME TABLE <table> TO <table>[, <table> TO <table>, <table> TO <table>...];
```

Rename a table or multiple tables at once.

### Examples

Rename a single table:

```sql
RENAME TABLE table_A TO table_B;
```

Swap table names:

```sql
RENAME TABLE table_A TO table_B, table_B TO table_A;

RENAME TABLE table_A TO table_B, table_B TO table_C, table_C TO table_A;
```

Swap table names multiple times:

```sql
RENAME TABLE table_A TO table_A_stale, table_B TO table_B_stale, table_A_new TO table_A, table_B_new TO table_B;
```

## RESTORE TABLE

```
RESTORE TABLE <table> FROM '<filepath>' [WITH (COMPRESSION='<compression_program>')];
```

Restores data and dictionary files of table `<table>` from the file at `<filepath>`. If you specified a compression program when you used the `DUMP TABLE` command, you must specify the same compression method during `RESTORE`.

Restoring a table decompresses and then reimports the table. You must have enough disk space for both the new table and the archived table, as well as enough scratch space to decompress the archive and reimport it.

The file path must be enclosed in single quotes.

You can also restore a table from archives stored in S3-compatible endpoints:

```
RESTORE TABLE <table> FROM '<S3_file_URL>' 
  WITH (compression = '<compression_program>', 
        s3_region = '<region>', 
        s3_access_key = '<access_key>', 
        s3_secret_key = '<secret_key>', 
        s3_session_token = '<session_token>');
```

`s3_region` is required. All features discussed in [the S3 import documentation](/v8.3.0/loading-and-exporting-data/command-line/load-data.md), such as custom S3 endpoints and server privileges, are supported.

{% hint style="info" %}

* Restoring a table locks writes to that table. Concurrent reads are supported, but you cannot import to a table that is being restored.
* The `RESTORE` command is not supported on distributed configurations.
* You must have a least GRANT CREATE ON DATABASE privilege level to use the `RESTORE` command.
  {% endhint %}

{% hint style="warning" %}
Do not attempt to use RESTORE TABLE with a table dump created using a release of HEAVY.AI that is higher than the release running on the server where you will restore the table.
{% endhint %}

### Examples

Restore table `tweets` from /opt/archive/tweetsBackup.gz:

```sql
RESTORE TABLE tweets FROM '/opt/archive/tweetsBackup.gz' 
   WITH (COMPRESSION='gzip');
```

Restore table `tweets` from a public S3 file or using server privileges (with the `allow-s3-server-privileges` server flag enabled):

```sql
RESTORE TABLE tweets FROM 's3://my-s3-bucket/archive/tweetsBackup.gz'
   WITH (compression = 'gzip', 
      s3_region = 'us-east-1');
```

Restore table `tweets` from a private S3 file using AWS access keys:

```sql
RESTORE TABLE tweets FROM 's3://my-s3-bucket/archive/tweetsBackup.gz' 2 
   WITH (compression = 'gzip', 
      s3_region = 'us-east-1', 
      s3_access_key = 'xxxxxxxxxx', s3_secret_key = 'yyyyyyyyy');
```

Restore table `tweets` from a private S3 file using temporary AWS access keys/session token:

```sql
RESTORE TABLE tweets FROM 's3://my-s3-bucket/archive/tweetsBackup.gz' 
   WITH (compression = 'gzip', 
      s3_region = 'us-east-1', 
      s3_access_key = 'xxxxxxxxxx', s3_secret_key = 'yyyyyyyyy',
      s3_session_token = 'zzzzzzzz');
```

Restore table `tweets` from an S3-compatible endpoint:

```sql
RESTORE TABLE tweets FROM 's3://my-gcp-bucket/archive/tweetsBackup.gz' 2 
   WITH (compression = 'gzip', 
      s3_region = 'us-east-1', 
      s3_endpoint = 'storage.googleapis.com');
```

## TRUNCATE TABLE

```
TRUNCATE TABLE <table>;
```

Use the `TRUNCATE TABLE` statement to remove all rows from a table without deleting the table structure.

This releases table on-disk and memory storage and removes dictionary content unless it is a shared dictionary. (See the note regarding [disk space reclamation](https://docs.omnisci.com/latest/5_tables.html#delete-admonition).)

Removing rows is more efficient than using [DROP TABLE](https://docs.omnisci.com/latest/5_tables.html#tablesdroptable). Dropping followed by recreating the table invalidates dependent objects of the table requiring you to regrant object privileges. Truncating has none of these effects.

### Example

```sql
TRUNCATE TABLE tweets;
```

{% hint style="info" %}
When you DROP or TRUNCATE, the command returns almost immediately. The directories to be purged are marked with the suffix *\\*\_DELETE\_ME\_. The files are automatically removed asynchronously.

In practical terms, this means that you will not see a reduction in disk usage until the automatic task runs, which might not start for up to five minutes.

You might also see directory names appended with *\\*\_DELETE\_ME\_. You can ignore these, with the expectation that they will be deleted automatically over time.
{% endhint %}

## OPTIMIZE TABLE

```
OPTIMIZE TABLE [<table>] [WITH (VACUUM='true')]
```

Use this statement to remove rows from storage that have been marked as deleted via `DELETE` statements.

When run without the vacuum option, the column-level metadata is recomputed for each column in the specified table. HeavyDB makes heavy use of metadata to optimize query plans, so optimizing table metadata can increase query performance after metadata widening operations such as updates or deletes. If the configuration parameter `enable-auto-metadata-update` is not set, HeavyDB does not narrow metadata during an update or delete — metadata is only widened to cover a new range.

When run with the vacuum option, it removes any rows marked "deleted" from the data stored on disk. Vacuum is a checkpointing operation, so new copies of any vacuum records are deleted. Using OPTIMIZE with the VACUUM option compacts pages and deletes unused data files that have not been repopulated.

{% hint style="info" %}
Beginning with Release 5.6.0, OPTIMIZE should be used infrequently, because UPDATE, DELETE, and IMPORT queries manage space more effectively.
{% endhint %}

## VALIDATE

```
VALIDATE
```

Performs checks for negative and inconsistent epochs across table shards for single-node configurations.

If `VALIDATE` detects epoch-related issues, it returns a report similar to the following:

```
heavysql> validate;
Result

Negative epoch value found for table "my_table". Epoch: -1.
Epoch values for table "my_table_2" are inconsistent:
Table Id  Epoch     
========= ========= 
4         1         
5         2
```

If no issues are detected, it reports as follows:

```
Instance OK
```

## VALIDATE CLUSTER

```
VALIDATE CLUSTER [WITH (REPAIR_TYPE = ['NONE' | 'REMOVE'])];
```

Perform checks and report discovered issues on a running HEAVY.AI cluster. Compare metadata between the aggregator and leaves to verify that the logical components between the processes are identical.

`VALIDATE CLUSTER` also detects and reports issues related to table epochs. It reports when epochs are negative or when table epochs across leaf nodes or shards are inconsistent.

### Examples

If `VALIDATE CLUSTER` detects issues, it returns a report similar to the following:

```
mapd@thing3 ~]$ /mnt/gluster/dist_mapd/mapd-sw2/bin/mapdql -p HyperInteractive
User admin connected to database heavyai
heavysql> validate cluster;
Result
 Node          Table Count 
 ===========   =========== 
 Aggregator     1116
 Leaf 0         1114
 Leaf 1         1114
No matching table on Leaf 0 for Table cities_dtl_POINTS table id 56
No matching table on Leaf 1 for Table cities_dtl_POINTS table id 56
No matching table on Leaf 0 for Table cities_dtl table id 80
No matching table on Leaf 1 for Table cities_dtl table id 80
Table details don't match on Leaf 0 for Table view_geo table id 95
Table details don't match on Leaf 1 for Table view_geo table id 95
```

If no issues are detected, it will report as follows:

```
Cluster OK
```

You can include the `WITH(REPAIR_TYPE)` argument. `(REPAIR_TYPE='NONE')` is the same as running the command with no argument. `(REPAIR_TYPE='REMOVE')` removes any leaf objects that have issues. For example:

```sql
VALIDATE CLUSTER WITH (REPAIR_TYPE = 'REMOVE');
```

#### Epoch Issue Example

This example output from the `VALIDATE CLUSTER` command on a distributed setup shows epoch-related issues:

```
heavysql> validate cluster;
Result

Negative epoch value found for table "my_table". Epoch: -16777216.
Epoch values for table "my_table_2" are inconsistent:
Node      Table Id  Epoch     
========= ========= ========= 
Leaf 0    4         1         
Leaf 1    4         2
```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.heavy.ai/v8.3.0/sql/data-definition-ddl/tables.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.