Skip to main content
Skip table of contents

PARTITION_KEY

Target database: Azure SQL Database, Azure Synapse SQL, Google BigQuery, Databricks SQL


Used to define the partitioning column for the partition scheme defined by physical option STORAGE.PARTITION_SCHEME.

Defining a PARTITION_KEY for an entity uses different features depending on the target database management system:

Target DMBS

Entity types

Feature

Azure SQL Database

TABLE

Partitions entity with partition scheme utilizing the defined PARTITION_KEY column

Google BigQuery

TABLE, MATERIALIZED_VIEW

Partitions entity with SQL snippet defined in partition scheme by replacing the defined <entity_partition_key> with the PARTITION_KEY column

Azure Synapse SQL

TABLE

Partitions entity with SQL snippet defined in partition scheme by replacing the defined <entity_partition_key> with the PARTITION_KEY column

Databricks SQL

TABLE,MATERIALIZED_VIEW

Partitions entity with SQL snippet defined in partition scheme by replacing the defined <entity_partition_key> with the PARTITION_KEY column


See also:


Usage

Pick one attribute to act as the partitioning column on the table or materialized view.


Notes

PARTITION_KEY can be used with tables in Azure SQL Database and requires partition key and partition scheme to be manually inserted to the Azure SQL Database beforehand.

PARTITION_KEY can be used with tables and materialized views in Google BigQuery and requires the partition scheme to be defined for the entity. Use DISTRIBUTION_KEY in case entity is to be partitioned over TIMESTAMP or DATE column.

PARTITION_KEY can be used with tables and materialized views in Databricks and requires the partition scheme to be defined for the entity. Use DISTRIBUTION_KEY for simple partitioning cases.

PARTITION_KEY can be used with tables in Azure Synapse SQL and requires the partition scheme to be defined for the entity.

Check Export SQL/Entity SQL after defining a PARTITION_KEY to see generated DDL.


Examples

Azure SQL Database partitioned table

Pre-step partition function and partition scheme defined in the Azure SQL Database:

CODE
CREATE PARTITION FUNCTION categoryPF (int)
AS RANGE LEFT FOR VALUES (200, 400, 600, 800);

CREATE PARTITION SCHEME categoryPartitioningPS
AS PARTITION categoryPF
ALL TO ('PRIMARY') ;

Key definition:

DEFINITION
KEY NAME: F_TRIP
KEY TYPE: PARTITION_KEY
ATTRIBUTE NAME: category

Physical option definition:

DEFINITION
Physical option name: STORAGE.PARTITION_SCHEME
Physical option value: categoryPartitioningPS

DDL generated by Agile Data Engine:

SQL
CREATE TABLE [pub].[F_TRIP] (
  ...
) ON  category_partitioning_ps([category]));

Google BigQuery partitioned table

Key definition:

DEFINITION
KEY NAME: F_TRIP
KEY TYPE: PARTITION_KEY
ATTRIBUTE NAME: category

Physical option definition:

DEFINITION
Physical option name: STORAGE.PARTITION_SCHEME
Physical option value: RANGE_BUCKET(<entity_partition_key>, GENERATE_ARRAY(0, 1000, 200))

DDL generated by Agile Data Engine:

SQL
CREATE TABLE pub.F_TRIP
(
  ...
)
PARTITION BY RANGE_BUCKET(category, GENERATE_ARRAY(0, 1000, 200));

Notes

The physical option depends on the data type.

Below are the physical option definitions with integer data type. The first physical option can be used when the domain of the attribute can naturally be divided into a few large segments where as the second option divides the values into as many partitions as there are values.

DEFINITION
Physical option name: STORAGE.PARTITION_SCHEME
Physical option value: RANGE_BUCKET(<entity_partition_key>, GENERATE_ARRAY(0, 1000, 200))
OR
Physical option name: STORAGE.PARTITION_SCHEME
Physical option value: <entity_partition_key>

Physical option definitions with timestamp data type:

DEFINITION
Physical option name: STORAGE.PARTITION_SCHEME
Physical option value: DATE(<entity_partition_key>)

Physical option definitions with date data type:

DEFINITION
Physical option name: STORAGE.PARTITION_SCHEME
Physical option value: <entity_partition_key>

Azure Synapse SQL partitioned table

Key definition:

DEFINITION
KEY NAME: F_TRIP
KEY TYPE: PARTITION_KEY
ATTRIBUTE NAME: category

Physical option definition:

DEFINITION
Physical option name: STORAGE.PARTITION_SCHEME
Physical option value: <entity_partition_key> RANGE FOR VALUES ( 200,400,600,800 )

DDL generated by Agile Data Engine:

SQL
CREATE TABLE pub.F_TRIP
(
  ...
)
WITH (
    PARTITION ([category] RANGE FOR VALUES ( 200,400,600,800 ))
);

Databricks SQL partitioned table

Please do pay attention to the text in Databricks SQL documentation:

CODE
Clustering is not compatible with partitioning or ZORDER, 
and requires that you use Databricks to manage all layout and 
optimization operations for data in your table.

Therefore in case both CLUSTERING and PARTITIONING is defined for Databricks SQL table, only CLUSTERING will be applied by Agile Data Engine.

Key definition:

DEFINITION
KEY NAME: F_TRIP
KEY TYPE: PARTITION_KEY
ATTRIBUTE NAME: category

Physical option definition:

DEFINITION
Physical option name: STORAGE.PARTITION_SCHEME
Physical option value: (<entity_partition_key> STRING)

DDL generated by Agile Data Engine:

SQL
CREATE TABLE pub.F_TRIP
(
  ...
)
PARTITION BY (category STRING);
JavaScript errors detected

Please note, these errors can depend on your browser setup.

If this problem persists, please contact our support.