Skip to main content
Skip table of contents

OPT_SCAN_ONLY_THE_LATEST_BY_KEY

Target database: Snowflake, Amazon Redshift, Azure SQL Database, Azure Synapse SQL, Microsoft Fabric, Google BigQuery, Databricks SQL

OPT_SCAN_ONLY_THE_LATEST_BY_KEY is a load option that uses windowing function to scan only the latest version of the data from the source. This suits cases where source data contains duplicates and only the latest version is to be selected from the source entity to the target entity.

With OPT_SCAN_ONLY_THE_LATEST_BY_KEY = true source data will be partitioned by the hash key and sorted in descending order by the load time and only the latest of them are selected i.e. Agile Data Engine will use internally the latest logged DV_LOAD_TIME or SDT_STAGE_CREATE_TIME attribute types for the sorting (unless OPT_SCAN_ONLY_THE_LATEST_WITH_SOURCE_ATTRIBUTE_TYPE is not set otherwise).


See also:


Usage

OPTION
OPT_SCAN_ONLY_THE_LATEST_BY_KEY: boolean

Default: false


Notes

OPT_SCAN_ONLY_THE_LATEST_BY_KEY can be used in table loads that have a single entity mapping (i.e. one source entity) and attributes with following technical attribute types defined:

  • Target entity has either DV_HASHKEY or DV_REFERENCING_HASHKEY defined

  • Source entity either DV_LOAD_TIME or SDT_STAGE_CREATE_TIME (for sorting) or custom technical attribute type defined by OPT_SCAN_ONLY_THE_LATEST_WITH_SOURCE_ATTRIBUTE_TYPE load option


Examples

Using OPT_SCAN_ONLY_THE_LATEST_BY_KEY

Set only the latest data to be scanned by the key:

OPTION
OPT_SCAN_ONLY_THE_LATEST_BY_KEY: true

When the load is generated, Agile Data Engine will add ROW_NUMBER (or DENSE_RANK in case of multi-active satellite) windowing function to the source data select and accept only the latest ones forward.

Resulting DML generated by Agile Data Engine:

SQL
INSERT INTO
  dv.H_TRIP (
    ...
  )
SELECT
  DISTINCT ...
FROM
  (
    SELECT
      ...
      , ROW_NUMBER()
            OVER (
                PARTITION BY
                    <DV_HASHKEY formula>
                ORDER BY
                     <SDT_STAGE_CREATE_TIME field> DESC
            )
            AS
                latest_key_entry
    FROM
      stage.STG_TRIPDATA_YELLOW src_entity
    WHERE
      VENDORID = 1
  ) src
WHERE
  latest_key_entry = 1
  AND
  NOT EXISTS (
    SELECT
      1
    FROM
      dv.H_TRIP trg
    WHERE
      trg.dv_id = src.dv_id
  );
JavaScript errors detected

Please note, these errors can depend on your browser setup.

If this problem persists, please contact our support.