OPT_SCAN_ONLY_THE_LATEST_BY_KEY
Target database: Snowflake, Amazon Redshift, Azure SQL Database, Azure Synapse SQL, Microsoft Fabric, Google BigQuery, Databricks SQL
OPT_SCAN_ONLY_THE_LATEST_BY_KEY is a load option that uses windowing function to scan only the latest version of the data from the source. This suits cases where source data contains duplicates and only the latest version is to be selected from the source entity to the target entity.
With OPT_SCAN_ONLY_THE_LATEST_BY_KEY = true source data will be partitioned by the hash key and sorted in descending order by the load time and only the latest of them are selected i.e. Agile Data Engine will use internally the latest logged DV_LOAD_TIME or SDT_STAGE_CREATE_TIME attribute types for the sorting (unless OPT_SCAN_ONLY_THE_LATEST_WITH_SOURCE_ATTRIBUTE_TYPE is not set otherwise).
See also:
Usage
OPT_SCAN_ONLY_THE_LATEST_BY_KEY: boolean
Default: false
Notes
OPT_SCAN_ONLY_THE_LATEST_BY_KEY can be used in table loads that have a single entity mapping (i.e. one source entity) and attributes with following technical attribute types defined:
Target entity has either DV_HASHKEY or DV_REFERENCING_HASHKEY defined
Source entity either DV_LOAD_TIME or SDT_STAGE_CREATE_TIME (for sorting) or custom technical attribute type defined by OPT_SCAN_ONLY_THE_LATEST_WITH_SOURCE_ATTRIBUTE_TYPE load option
Examples
Using OPT_SCAN_ONLY_THE_LATEST_BY_KEY
Set only the latest data to be scanned by the key:
OPT_SCAN_ONLY_THE_LATEST_BY_KEY: true
When the load is generated, Agile Data Engine will add ROW_NUMBER (or DENSE_RANK in case of multi-active satellite) windowing function to the source data select and accept only the latest ones forward.
Resulting DML generated by Agile Data Engine:
INSERT INTO
dv.H_TRIP (
...
)
SELECT
DISTINCT ...
FROM
(
SELECT
...
, ROW_NUMBER()
OVER (
PARTITION BY
<DV_HASHKEY formula>
ORDER BY
<SDT_STAGE_CREATE_TIME field> DESC
)
AS
latest_key_entry
FROM
stage.STG_TRIPDATA_YELLOW src_entity
WHERE
VENDORID = 1
) src
WHERE
latest_key_entry = 1
AND
NOT EXISTS (
SELECT
1
FROM
dv.H_TRIP trg
WHERE
trg.dv_id = src.dv_id
);