OPT_SCAN_ONLY_THE_LATEST_WITH_SOURCE_ATTRIBUTE_TYPE
Target database: Snowflake, Amazon Redshift, Azure SQL Database, Azure Synapse SQL, Microsoft Fabric, Google BigQuery, Databricks SQL
OPT_SCAN_ONLY_THE_LATEST_WITH_SOURCE_ATTRIBUTE_TYPE is a load option that specifies the technical attribute type which is to be used as the sorting attribute type when OPT_SCAN_ONLY_THE_LATEST_BY_KEY is enabled.
With OPT_SCAN_ONLY_THE_LATEST_WITH_SOURCE_ATTRIBUTE_TYPE = {technical attribute type chosen}, source data will be partitioned by the hash key and sorted in descending order by the {technical attribute type chosen}.
See also:
Usage
OPT_SCAN_ONLY_THE_LATEST_WITH_SOURCE_ATTRIBUTE_TYPE: string
Default: none
Notes
OPT_SCAN_ONLY_THE_LATEST_WITH_SOURCE_ATTRIBUTE_TYPE is designed to be used only along with the option OPT_SCAN_ONLY_THE_LATEST_BY_KEY and only in cases where the default source entity field for sorting the data can not be used or is not available. When both of these load options are defined, the data needs to contain following attribute types:
Target entity has either DV_HASHKEY or DV_REFERENCING_HASHKEY defined
Source entity needs to contain technical attribute type defined in OPT_SCAN_ONLY_THE_LATEST_WITH_SOURCE_ATTRIBUTE_TYPE load option
Examples
Using OPT_SCAN_ONLY_THE_LATEST_WITH_SOURCE_ATTRIBUTE_TYPE
Set only the latest data to be scanned by the key:
OPT_SCAN_ONLY_THE_LATEST_BY_KEY: true
Choose a suitable attribute type or create a new one through CONFIG_ENTITY_DEFAULTS - package to be given for the source entity attribute, f.g. SRC_ENTRY_TIME and set it for the source entity attribute.
Set the source attribute type for the scan only latest feature:
OPT_SCAN_ONLY_THE_LATEST_WITH_SOURCE_ATTRIBUTE_TYPE: SRC_ENTRY_TIME
When the load is generated, Agile Data Engine will add ROW_NUMBER (or DENSE_RANK in case of multi-active satellite) windowing function to the source data select and accept only the latest ones forward.
Resulting DML generated by Agile Data Engine:
INSERT INTO
dv.H_TRIP (
...
)
SELECT
DISTINCT ...
FROM
(
SELECT
...
, ROW_NUMBER()
OVER (
PARTITION BY
<DV_HASHKEY formula>
ORDER BY
<SRC_ENTRY_TIME field> DESC
)
AS
latest_key_entry
FROM
stage.STG_TRIPDATA_YELLOW src_entity
WHERE
VENDORID = 1
) src
WHERE
latest_key_entry = 1
AND
NOT EXISTS (
SELECT
1
FROM
dv.H_TRIP trg
WHERE
trg.dv_id = src.dv_id
);