SORT_KEY

Target database: Snowflake, Amazon Redshift, Azure SQL Database, Azure Synapse SQL, Google BigQuery, Databricks SQL

Defining a SORT_KEY for an entity uses different features depending on the target database management system:

Target DMBS	Feature
Snowflake	CLUSTER BY
Amazon Redshift	COMPOUND SORTKEY
Azure SQL Database, Azure Synapse SQL	CLUSTERED INDEX
Google BigQuery	CLUSTER BY
Databricks SQL	CLUSTER BY

See also:

Usage

Pick one or several attributes to form the key and give a name to the key.

Notes

SORT_KEY can be used with tables and materialized views in Snowflake, Amazon Redshift and Azure SQL Database. Azure SQL Database supports clustered indexes also for views.

SORT_KEY can be used with tables in Azure Synapse SQL and Databricks SQL.

Check Export SQL/Entity SQL after defining a SORT_KEY to see generated DDL.

Examples

Snowflake CLUSTER BY

Key definition:

DEFINITION

KEY NAME: SORTKEY_F_TRIP
KEY TYPE: SORT_KEY
ATTRIBUTE 1: trip_wid

DDL generated by Agile Data Engine:

SQL

CREATE TABLE pub.F_TRIP (
  ...
) CLUSTER BY (trip_wid);

Amazon Redshift COMPOUND SORTKEY

Key definition:

DEFINITION

KEY NAME: SORTKEY_F_TRIP
KEY TYPE: SORT_KEY
ATTRIBUTE 1: trip_wid

DDL generated by Agile Data Engine:

SQL

CREATE TABLE pub.F_TRIP (
  ...
) COMPOUND SORTKEY (trip_wid);

Azure SQL Database CLUSTERED INDEX

Key definition:

DEFINITION

KEY NAME: SORTKEY_F_TRIP
KEY TYPE: SORT_KEY
ATTRIBUTE 1: trip_wid

DDL generated by Agile Data Engine:

SQL

CREATE TABLE [pub].[F_TRIP] (
  ...
);
CREATE CLUSTERED INDEX [SORTKEY_F_TRIP] ON [pub].[F_TRIP] ([trip_wid]);

Azure Synapse SQL CLUSTERED INDEX

Key definition:

DEFINITION

KEY NAME: SORTKEY_F_TRIP
KEY TYPE: SORT_KEY
ATTRIBUTE 1: trip_wid

DDL generated by Agile Data Engine:

SQL

CREATE TABLE [pub].[F_TRIP] (
  ...
) WITH (CLUSTERED INDEX ([trip_wid]));

Google BigQuery CLUSTER BY

Key definition:

DEFINITION

KEY NAME: SORTKEY_F_TRIP
KEY TYPE: SORT_KEY
ATTRIBUTE 1: trip_wid

DDL generated by Agile Data Engine:

SQL

CREATE TABLE pub.F_TRIP (
  ...
) CLUSTER BY (trip_wid);

Databricks SQL CLUSTER BY

Please do pay attention to the text in Databricks SQL documentation:

CODE

Clustering is not compatible with partitioning or ZORDER, 
and requires that you use Databricks to manage all layout and 
optimization operations for data in your table.

Therefore in case both CLUSTERING and PARTITIONING is defined for Databricks SQL table, only CLUSTERING will be applied by Agile Data Engine.

Key definition:

DEFINITION

KEY NAME: SORTKEY_F_TRIP
KEY TYPE: SORT_KEY
ATTRIBUTE 1: trip_wid

DDL generated by Agile Data Engine:

SQL

CREATE TABLE pub.F_TRIP (
  ...
) CLUSTER BY (trip_wid);