OPT_ORDER_BY is a load option that enables partial sorting of a table while loading in automatically generated loads.
OPT_ORDER_BY can be used with loads that have a single entity mapping (i.e. one source entity) and no OVERRIDE load steps. OPT_ORDER_BY will not have an effect if the load is overridden with an OVERRIDE load step.
Does not work with Synapse Multi-Active Satellite as the Datahash logic has been implemented with subquery.
Order data by timestamp when inserting into a satellite
OPT_ORDER_BY set to order the inserted data by attribute LPEP_PICKUP_DATETIME in ascending order:
OPT_ORDER_BY: LPEP_PICKUP_DATETIME ASC
Resulting DML generated by Agile Data Engine:
NOT EXISTS (
trg.dv_datahash = src.dv_datahash
Snowflake: Benefit from automatic micro-partitioning
Ordering inserted data by carefully chosen attributes can greatly improve query performance in Snowflake due to automatic micro-partitioning. For example, if the load is incremental like in this satellite example, an obvious choice would be a timestamp column that might often be used in queries to filter the target entity, and that already increments naturally over batches of incoming data.
More details about micro-partitions and data clustering in Snowflake documentation.