Skip to main content
Skip table of contents

OPT_NUM_RUN_IDS_PER_TRANSACTION

OPT_NUM_RUN_IDS_PER_TRANSACTION is a load option that controls how many run IDs are loaded at a time in a database transaction when Run ID Logic is enabled for the load.


See also:


Usage

OPTION
OPT_NUM_RUN_IDS_PER_TRANSACTION: integer

Default: 100


Notes

OPT_NUM_RUN_IDS_PER_TRANSACTION can be used in table loads that have a single entity mapping (i.e. one source entity).

Use with:

OPTION
OPT_USE_RUN_IDS: true


OPT_NUM_RUN_IDS_PER_TRANSACTION can also be used to start tracking run ids from any load by setting value -1, see example here.


Examples

Loading a satellite one batch at a time

Satellites record the change history of the descriptive attributes of a business key. It is important that the history is stored in the correct order into the satellite so that it is known what the status was at any given point in time. A satellite load compares the existing status (current view) to the incoming data and inserts new and changed information per business key. If multiple batches are queued at staging to be loaded into a satellite, and the batches contain changed data of the same business keys, loading all batches at once will produce multiple rows into the satellite per business key with the same timestamp. As a result, the default current view will no longer function properly.

Run ID Logic can be used to resolve this problem.

Assumptions:

  • Data is staged to table stage.stg_taxi_zone_lookup and it will be loaded to rdv.s_taxi_zone

  • stage.stg_taxi_zone_lookup is not truncated between loads

  • Millisecond timestamps are used as unique batch ID for incoming data

  • Batch IDs 1610300000000, 1610200000000 and 1610100000000 are new queued batches that have not yet been loaded into the satellite

  • stage.stg_taxi_zone_lookup has attribute SDT_STAGE_BATCH_ID which is configured as technical attribute type RUN_ID.

Sample data in staging table ordered in descending order by batch ID:

SDT_STAGE_BATCH_ID

LOCATIONID

BOROUGH

ZONE

SERVICE_ZONE

1610300000000

1

EWR

Newark Airport

EWR

1610300000000

2

Queens

Jamaica Bay

Boro Zone

1610300000000

...

...

...

...

1610300000000

99

Staten Island

Freshkills Park

Boro Zone

1610200000000

1

EWR

Newark Apt

EWR

1610200000000

2

Queens

Jamaica Bay

Boro Zone

1610200000000

...

...

...

...

1610200000000

99

Staten Island

Freshkills Park

Boro Zone

1610100000000

1

EWR

EWR

EWR

1610100000000

2

Queens

Jamaica Bay

Boro Zone

1610100000000

...

...

...

...

1610100000000

99

Staten Island

Freshkills Park

Boro Zone

...

...

...

...

...

Run ID Logic is enabled for the load:

OPTION
OPT_USE_RUN_IDS: true

Run IDs per transaction is set to 1:

OPTION
OPT_NUM_RUN_IDS_PER_TRANSACTION: 1

Agile Data Engine will execute the load one batch at a time starting from the lowest batch ID:

SQL
INSERT INTO
  dv.S_TAXI_ZONE (
    ...
  )
SELECT
  DISTINCT ...
FROM
  (
    SELECT
      ...
      1234 AS dv_run_id,
      ...
    FROM
      stage.STG_TAXI_ZONE_LOOKUP src_entity
    WHERE
      sdt_stage_batch_id IN (1610100000000)
  ) src
WHERE
  NOT EXISTS (
    SELECT
      1
    FROM
      dv.S_TAXI_ZONE_C trg
    WHERE
      trg.dv_datahash = src.dv_datahash
  );

Then:

SQL
INSERT INTO
  dv.S_TAXI_ZONE (
    ...
  )
SELECT
  DISTINCT ...
FROM
  (
    SELECT
      ...
      1235 AS dv_run_id,
      ...
    FROM
      stage.STG_TAXI_ZONE_LOOKUP src_entity
    WHERE
      sdt_stage_batch_id IN (1610200000000)
  ) src
WHERE
  NOT EXISTS (
    SELECT
      1
    FROM
      dv.S_TAXI_ZONE_C trg
    WHERE
      trg.dv_datahash = src.dv_datahash
  );

etc. until the latest queued batch has been loaded.

JavaScript errors detected

Please note, these errors can depend on your browser setup.

If this problem persists, please contact our support.