STORAGE.PARTITION_SCHEME
Target databases: Azure SQL Database, Azure Synapse SQL, Google BigQuery, Databricks SQL
The STORAGE.PARTITION_SCHEME option defines partitioning for an entity. It also requires PARTITION_KEY to be defined.
Target DBMS | Entity types | Feature |
---|---|---|
Azure SQL Database | TABLE | Partitions the entity with a partition scheme that uses the defined PARTITION_KEY column |
Azure Synapse SQL | TABLE | Partitions the entity with the SQL snippet defined in the partition scheme, replacing <entity_partition_key> with the PARTITION_KEY column |
Google BigQuery | TABLE, MATERIALIZED_VIEW | Partitions the entity with the SQL snippet defined in the partition scheme, replacing <entity_partition_key> with the PARTITION_KEY column |
Databricks SQL | TABLE, MATERIALIZED_VIEW | Partitions the entity with the SQL snippet defined in the partition scheme, replacing <entity_partition_key> with the PARTITION_KEY column |
See also:
Create partitioned tables and indexes (Microsoft documentation)
Partitioning tables in dedicated SQL Pool (Microsoft documentation)
Table partitioning options in Azure Synapse SQL (Microsoft documentation)
Usage
For Azure SQL Database, define the name of a partition scheme that exists in the target database.
For Google BigQuery, Databricks SQL and Azure Synapse SQL, define the SQL snippet containing the partitioning rule and refer to the partition key with the syntax <entity_partition_key>.
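Conceptually, the snippet-based targets work by plain placeholder substitution: the value of STORAGE.PARTITION_SCHEME is rendered into the DDL with <entity_partition_key> replaced by the PARTITION_KEY column. A minimal sketch of that substitution (the function below is illustrative only, not part of Agile Data Engine):

```python
# Illustrative sketch: resolve the <entity_partition_key> placeholder in a
# STORAGE.PARTITION_SCHEME value against the entity's PARTITION_KEY column.
# This function is not part of the product; it only demonstrates the mechanism.
def render_partition_clause(scheme_snippet: str, partition_key_column: str) -> str:
    """Replace the <entity_partition_key> placeholder with the key column name."""
    return scheme_snippet.replace("<entity_partition_key>", partition_key_column)

# Example with a BigQuery-style range bucket snippet:
snippet = "RANGE_BUCKET(<entity_partition_key>, GENERATE_ARRAY(0, 1000, 200))"
print(render_partition_clause(snippet, "category"))
# RANGE_BUCKET(category, GENERATE_ARRAY(0, 1000, 200))
```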
Notes
Check Export SQL/Entity SQL after defining a PARTITION_KEY to see the generated DDL.
Azure SQL Database
Can be used with tables. Requires the partition function and partition scheme to be created manually in the target database beforehand, and PARTITION_KEY to be defined.
Google BigQuery
Can be used with tables and materialized views. If needed, define the partitioning column with PARTITION_KEY; it is also possible to partition the entity over ingestion time, which does not require a column definition.
Basic partitioning over a TIMESTAMP or DATE column (generates day partitions) is supported with DISTRIBUTION_KEY and does not require this physical option to be defined.
Databricks SQL
Can be used with tables and materialized views. If needed, define the partitioning column with PARTITION_KEY.
Basic partitioning over a set of columns is supported with DISTRIBUTION_KEY and does not require this physical option to be defined.
Azure Synapse SQL
Can be used with tables. Define the partitioning column with PARTITION_KEY.
The partition scheme follows the pattern:
<entity_partition_key> RANGE [ LEFT | RIGHT ] FOR VALUES ( [ boundary_value [,...n] ] )
Fixed partition range values
Azure Synapse SQL partitioning supports only fixed range values at table creation time. Agile Data Engine does not currently support the range alteration logic that would be required if further ranges appear in the future. For more information on partition range split and merge operations, review Partition range alteration (Microsoft documentation).
Examples
Azure SQL Database partitioned table
Pre-step: partition function and partition scheme defined in the Azure SQL Database:
CREATE PARTITION FUNCTION categoryPF (int)
AS RANGE LEFT FOR VALUES (200, 400, 600, 800);
CREATE PARTITION SCHEME categoryPartitioningPS
AS PARTITION categoryPF
ALL TO ('PRIMARY');
Key definition:
KEY NAME: F_TRIP
KEY TYPE: PARTITION_KEY
ATTRIBUTE NAME: category
Physical option definition:
Physical option name: STORAGE.PARTITION_SCHEME
Physical option value: categoryPartitioningPS
DDL generated by Agile Data Engine:
CREATE TABLE [pub].[F_TRIP] (
...
) ON categoryPartitioningPS([category]);
Google BigQuery partitioned table
Key definition:
KEY NAME: F_TRIP
KEY TYPE: PARTITION_KEY
ATTRIBUTE NAME: category
Physical option definition:
Physical option name: STORAGE.PARTITION_SCHEME
Physical option value: RANGE_BUCKET(<entity_partition_key>, GENERATE_ARRAY(0, 1000, 200))
DDL generated by Agile Data Engine:
CREATE TABLE pub.F_TRIP
(
...
)
PARTITION BY RANGE_BUCKET(category, GENERATE_ARRAY(0, 1000, 200));
Azure Synapse SQL partitioned table
Key definition:
KEY NAME: F_TRIP
KEY TYPE: PARTITION_KEY
ATTRIBUTE NAME: category
Physical option definition:
Physical option name: STORAGE.PARTITION_SCHEME
Physical option value: <entity_partition_key> RANGE FOR VALUES ( 200,400,600,800 )
DDL generated by Agile Data Engine:
CREATE TABLE pub.F_TRIP
(
...
)
WITH (
PARTITION ([category] RANGE FOR VALUES ( 200,400,600,800 ))
);
Databricks SQL partitioned table
Note the following statement in the Databricks SQL documentation:
"Clustering is not compatible with partitioning or ZORDER,
and requires that you use Databricks to manage all layout and
optimization operations for data in your table."
Therefore, if both CLUSTERING and PARTITIONING are defined for a Databricks SQL table, only CLUSTERING will be applied by Agile Data Engine.
Key definition:
KEY NAME: F_TRIP
KEY TYPE: PARTITION_KEY
ATTRIBUTE NAME: category
Physical option definition:
Physical option name: STORAGE.PARTITION_SCHEME
Physical option value: (<entity_partition_key> STRING)
DDL generated by Agile Data Engine:
CREATE TABLE pub.F_TRIP
(
...
)
PARTITION BY (category STRING);