Skip to main content
Skip table of contents

OPT_USE_RUN_IDS

OPT_USE_RUN_IDS is a load option that enables the Run ID Logic for a load. When Run ID Logic is enabled, Agile Data Engine will load only new data batches by keeping track of loaded run ids between a source and a target entity and filtering loads with the SDT_STAGE_BATCH_ID and DV_RUN_ID attributes.


See also:


Usage

OPTION
OPT_USE_RUN_IDS: Boolean

Default: false


Notes

OPT_USE_RUN_IDS can be used in table loads that have a single entity mapping (i.e. one source entity).

Following Run ID Logic related variables can be used with OVERRIDE load steps:

Variable

Description

<sourcerunidattr>

Run ID attribute name in data source

<targetrunidattr>

Run ID attribute name in target table

<loadablerunids>

Comma-separated list of loadable run IDs (e.g. 23,24,55)

<targetrunid>

New target run id


Examples

Load satellite with Run ID Logic

This example presents how a satellite load would work with Run ID Logic enabled.

Assumptions:

  • Data is staged to table stage.stg_taxi_zone_lookup and it will be loaded to rdv.s_taxi_zone

  • stage.stg_taxi_zone_lookup is not truncated between loads

  • Millisecond timestamps are used as unique batch ID for incoming data

  • Batch ID = 1610300000000 is a new queued batch that has not yet been loaded into the satellite

  • stage.stg_taxi_zone_lookup has attribute SDT_STAGE_BATCH_ID which is configured as technical attribute type RUN_ID.

Sample data in staging table ordered in descending order by batch ID:

SDT_STAGE_BATCH_ID

LOCATIONID

BOROUGH

ZONE

SERVICE_ZONE

1610300000000

1

EWR

Newark Airport

EWR

1610300000000

2

Queens

Jamaica Bay

Boro Zone

1610300000000

...

...

...

...

1610300000000

99

Staten Island

Freshkills Park

Boro Zone

1610200000000

1

EWR

EWR

EWR

1610200000000

2

Queens

Jamaica Bay

Boro Zone

1610200000000

...

...

...

...

1610200000000

99

Staten Island

Freshkills Park

Boro Zone

1610100000000

1

EWR

EWR

EWR

1610100000000

2

Queens

Jamaica Bay

Boro Zone

1610100000000

...

...

...

...

1610100000000

99

Staten Island

Freshkills Park

Boro Zone

...

...

...

...

...

Run ID Logic is enabled for the load:

OPTION
OPT_USE_RUN_IDS: true

Export SQL generated by Agile Data Engine:

SQL
INSERT INTO
  dv.S_TAXI_ZONE (
    ...
  )
SELECT
  DISTINCT ...
FROM
  (
    SELECT
      ...
      <targetrunid> AS dv_run_id,
      ...
    FROM
      stage.STG_TAXI_ZONE_LOOKUP src_entity
    WHERE
      <sourcerunidattr> IN (<loadablerunids>)
  ) src
WHERE
  NOT EXISTS (
    SELECT
      1
    FROM
      dv.S_TAXI_ZONE_C trg
    WHERE
      trg.dv_datahash = src.dv_datahash
  );

Note that Agile Data Engine will place a WHERE clause into the SELECT statement from the staging table where it filters the source batch id attribute by loadable run IDs (queued batches). Agile Data Engine will also write a new target batch id into the DV_RUN_ID attribute in the target satellite table. This can be used for further loads using Run ID Logic from the satellite table.

Let us assume that this is the 1234th load between the staging table and satellite table. The DML will be resolved at runtime as:

SQL
INSERT INTO
  dv.S_TAXI_ZONE (
    ...
  )
SELECT
  DISTINCT ...
FROM
  (
    SELECT
      ...
      1234 AS dv_run_id,
      ...
    FROM
      stage.STG_TAXI_ZONE_LOOKUP src_entity
    WHERE
      sdt_stage_batch_id IN (1610300000000)
  ) src
WHERE
  NOT EXISTS (
    SELECT
      1
    FROM
      dv.S_TAXI_ZONE_C trg
    WHERE
      trg.dv_datahash = src.dv_datahash
  );
JavaScript errors detected

Please note, these errors can depend on your browser setup.

If this problem persists, please contact our support.