DISTRIBUTION_KEY
Target database: Amazon Redshift, Azure Synapse SQL, Google BigQuery
Defining a DISTRIBUTION_KEY for an entity uses different features depending on the target database management system:
Target DMBS | Feature |
---|---|
Amazon Redshift | KEY distribution |
Azure Synapse SQL | Hash distribution |
Google BigQuery | PARTITION BY |
See also:
Usage
Pick one or several attributes to form the key and give a name to the key.
Notes
DISTRIBUTION_KEY can be used with tables and materialized views in Amazon Redshift, Azure Synapse SQL and Google BigQuery.
In AMAZON REDSHIFT & AZURE SYNAPSE SQL, use with key distributed entities:
DISTRIBUTION STYLE: KEY
In Google BigQuery DISTRIBUTION_KEY is to be used with TIMESTAMP and DATE field based partitioning generating day partitions. For more complex partitioning see STORAGE.PARTITION_SCHEME and PARTITION_KEY.
Check Export SQL/Entity SQL after defining a DISTRIBUTION_KEY to see generated DDL.
Examples
Amazon Redshift key distributed table
Entity distribution style:
DISTRIBUTION STYLE: KEY
Key definition:
KEY NAME: DISTKEY_F_TRIP
KEY TYPE: DISTRIBUTION_KEY
ATTRIBUTE 1: vendorid
DDL generated by Agile Data Engine:
CREATE TABLE pub.F_TRIP (
...
) DISTSTYLE KEY DISTKEY (vendorid);
Azure Synapse SQL hash distributed table
Entity distribution style:
DISTRIBUTION STYLE: KEY
Key definition:
KEY NAME: DISTKEY_F_TRIP
KEY TYPE: DISTRIBUTION_KEY
ATTRIBUTE 1: vendorid
DDL generated by Agile Data Engine:
CREATE TABLE [pub].[F_TRIP] (
...
) WITH (DISTRIBUTION = HASH ([vendorid]));
Google BigQuery partitioned table
Key definition:
KEY NAME: DISTKEY_F_TRIP
KEY TYPE: DISTRIBUTION_KEY
ATTRIBUTE 1: lpep_pickup_datetime
DDL generated by Agile Data Engine:
CREATE TABLE pub.F_TRIP
(
...
)
PARTITION BY DATE(lpep_pickup_datetime);