Skip to main content
Skip table of contents

OPT_ORDER_BY

OPT_ORDER_BY is a load option that enables partial sorting of a table while loading in automatically generated loads.


See also:


Usage

OPTION
OPT_ORDER_BY: sql_code

Notes

OPT_ORDER_BY can be used with loads that have a single entity mapping (i.e. one source entity) and no OVERRIDE load steps. OPT_ORDER_BY will not have an effect if the load is overridden with an OVERRIDE load step.

Does not work with Synapse Multi-Active Satellite as the Datahash logic has been implemented with subquery.


Examples

Order data by timestamp when inserting into a satellite

OPT_ORDER_BY set to order the inserted data by attribute LPEP_PICKUP_DATETIME in ascending order:

OPTION
OPT_ORDER_BY: LPEP_PICKUP_DATETIME ASC

Resulting DML generated by Agile Data Engine:

SQL
INSERT INTO
  dv.S_TRIP_GREEN (
    ...
  )
SELECT
  DISTINCT ...
FROM
  (
    SELECT
      ...
    FROM
      stage.STG_TRIPDATA_GREEN src_entity
  ) src
WHERE
  NOT EXISTS (
    SELECT
      1
    FROM
      dv.S_TRIP_GREEN_C trg
    WHERE
      trg.dv_datahash = src.dv_datahash
  )
ORDER BY
  LPEP_PICKUP_DATETIME ASC;

Snowflake: Benefit from automatic micro-partitioning
Ordering inserted data by carefully chosen attributes can greatly improve query performance in Snowflake due to automatic micro-partitioning. For example, if the load is incremental like in this satellite example, an obvious choice would be a timestamp column that might often be used in queries to filter the target entity, and that already increments naturally over batches of incoming data.

More details about micro-partitions and data clustering in Snowflake documentation.

JavaScript errors detected

Please note, these errors can depend on your browser setup.

If this problem persists, please contact our support.