Skip to main content
Skip table of contents

OPT_INSERT_UPDATE_BY_DATAHASH_AS_OVERWRITE

Target database: Snowflake, Databricks SQL

OPT_INSERT_UPDATE_BY_DATAHASH_AS_OVERWRITE is a load option that enables 1-1 loading with key and datahash


See also:


Usage

OPTION
OPT_INSERT_UPDATE_BY_DATAHASH_AS_OVERWRITE: Boolean

Default: false


Notes

OPT_INSERT_UPDATE_BY_DATAHASH_AS_OVERWRITE can be used in table loads that have a single entity mapping (i.e. one source entity) and attributes with following attribute types defined:

  • DV_LOAD_TIME

  • DV_DATAHASH

  • either DV_HASHKEY or DV_REFERENCING_HASHKEY

  • DV_UPDATE_TIME (optional)

Hint: Can be used also with Run ID logic.


Examples

Insert, Update Load by DATAHASH as OVERWRITE option

This example presents how an entity load would work with this logic enabled.

Assumptions:

  • Data is staged to table stage.stg_taxi_zone_lookup and it will be loaded to tgt.latest_entries

Feature is enabled for the load by setting option:

OPTION
OPT_INSERT_UPDATE_BY_DATAHASH_AS_OVERWRITE: true

Resulting DML generated by Agile Data Engine:

SQL
INSERT OVERWRITE INTO tgt.LATEST_ENTRIES
-- collecting data from target which do not exist in source by key
SELECT
  ....
FROM
  tgt.LATEST_ENTRIES trg
WHERE
  NOT EXISTS (
    SELECT
      1
    FROM
      (
        SELECT
          DISTINCT <key formula> AS latest_entries_key
        FROM
          stage.STG_TAXI_ZONE_LOOKUP src_entity
      ) src
    WHERE
      trg.latest_entries_key = src.latest_entries_key
  )
UNION ALL
-- Data from target with unchanged datahash
SELECT
  ...
FROM
  tgt.LATEST_ENTRIES trg
WHERE
  EXISTS (
    SELECT
      1
    FROM
      stage.STG_TAXI_ZONE_LOOKUP src
    WHERE
      trg.datahash = src.<key formula>
      )
  )
UNION ALL
-- Data from source with changed datahash
SELECT
  DISTINCT 
  ...
FROM
  (
    SELECT
    <key formula> AS latest_entries_key
      ...
    FROM
      stage.STG_TAXI_ZONE_LOOKUP src_entity
  ) src
   -- LEFT OUTER JOIN is used if both DV_LOAD_TIME and DV_UPDATE_TIME existing (to store the initial insertion time)
  LEFT OUTER JOIN tgt.LATEST_ENTRIES trg on (
    trg.latest_entries_key = src.latest_entries_key
  )
  WHERE
  NOT EXISTS (
    SELECT
      1
    FROM
      tgt.LATEST_ENTRIES trg
    WHERE
      trg.datahash = src.datahash
  );
JavaScript errors detected

Please note, these errors can depend on your browser setup.

If this problem persists, please contact our support.