Skip to main content
Skip table of contents

Status satellite

Status satellites, also known as status tracking satellites or effectivity satellites, are used to track the validity of records in either hub or link tables.

Creating a status satellite

Follow these steps to create a status satellite:

  1. Select or create a package and create a new entity, choose type S_SAT and physical type TABLE.

  2. Navigate to the source entity (staging table) and create an outgoing load selecting the status satellite as target entity.

  3. Create an entity mapping as follows:

Source attribute

Transformation

Target attribute

Notes

Business key attribute(s) in source entity.

HASH

dv_id

Map the source attribute(s) for the dv_id of status satellite. This mapping should match the key of the related link or hub.

Note that if you add a foreign key reference from the status satellite dv_id to the link or hub dv_id beforehand, you can use the same mapping as in the primary key automatically by selecting Copy by Reference.

-

CURRENT_TS

dv_load_time

Load timestamp, does not need any modifications by default.

-

RUN_ID

dv_run_id

Run ID (see Using Run ID Logic), does not need any modifications by default.

Source name attribute in source entity.

NONE / CUSTOM

dv_source

Best practice is to have the source name as a metadata attribute on staging entities and to map it to Data Vault entities without any transformations.

Alternatively, a CUSTOM transformation can be used e.g. with a source name or an environment variable given in single quotes. However, it is best to avoid the CUSTOM transformation type if possible.

Business key attribute(s) in source entity.

HASH

dv_driving_key

Driving key of related link or hub key if creating a hub status satellite.

See examples for more details on choosing the driving key.

-

DV_STATUS

status

Status of the record, does not need any modifications by default.

4. By default, Agile Data Engine assumes that source data is received as delta extracts (OPT_DELTA_EXTRACT = true). If you are receiving a full extract instead, set load option OPT_DELTA_EXTRACT = false. Note that a hub status satellite can only be loaded with a full extract using the default status satellite logic. Implement custom logic if your source can provide hub status information as a delta extract.

Use the Export SQL feature to preview entity SQL & load SQL. Note that Agile Data Engine will automatically create current views (entity type S_SAT_C) for status satellites.

Naming conventions

Note that you can modify the default entity and attribute naming used in the examples in the CONFIG_ENTITY_DEFAULTS configuration package to fit your naming conventions. See more in Customizing default attributes and transformations

Examples

Status satellite for a link with delta data

With a status satellite, we can track the validity history of records on a link table. In this example case, we extract delta data on a daily basis from a source system and need to track the validity of the relationship between store_id and sales_person_id.

Source data

Consider the following example source data in the staging table, loaded on 2023-05-25:

store_id

store_name

sales_person_id

modifieddate

stg_batch_id

444

Fourth Bike Store

275

2023-05-24 11:15:07.497

1685014636041

798

Sales and Supply Company

275

2023-05-24 11:15:07.497

1685014636041

916

Blue Bicycle Company

275

2023-05-24 11:15:07.497

1685014636041

Each store has a responsible sales person (sales_person_id = 275) attached to it. Source data is loaded from the source system daily as a delta load according to modifieddate.

What happens if sales person is allocated to different store? How does our data warehouse track this change?

New source data is loaded on 2023-05-28 and data has been changed. Store_id 444 now contains sales_person_id 287:

store_id

store_name

sales_person_id

modifieddate

stg_batch_id

444

Fourth Bike Store

287

2023-05-27 11:15:07.497

1685257212000

See below how this change can be recorded with a link and a status satellite.

The source data has been modeled into the following link table:

Source attribute

Transformation

Target attribute

Notes

store_id &
sales_person_id

HASH

dv_id

Store id and sales person id will form the primary key for the link.

-

CURRENT_TS

dv_load_time

Load timestamp

-

RUN_ID

dv_run_id

Run ID (see Using Run ID Logic)

Source name attribute in source entity.

NONE / CUSTOM

dv_source

Best practice is to have the source name as a metadata attribute on staging entities and to map it to Data Vault entities without any transformations.

Alternatively, a CUSTOM transformation can be used e.g. with a source name or an environment variable given in single quotes. However, it is best to avoid the CUSTOM transformation type if possible.

store_id

HASH

dv_id_store

Reference to the store hub

sales_person_id

HASH

dv_id_sales_person

Reference to the sales person hub

Status satellite

To create a status satellite for the link, you need to define the following:

  • What is the driving key for the status satellite? The granularity of the data is store and it contains store’s salesperson, thus the driving key for the status satellite is store.

  • In other words, we want to record who is responsible for a store at any given moment.

With a status satellite we can track this change history and also query the most recent data. Status satellite definition in Agile Data Engine:

Source attribute

Transformation

Target attribute

Notes

store_id &

sales_person_id

HASH

dv_id

Use the same mapping as in the link dv_id.

-

CURRENT_TS

dv_load_time

Load timestamp, does not need any modifications by default.

-

RUN_ID

dv_run_id

Run ID (see Using Run ID Logic), does not need any modifications by default.

Source name attribute in source entity.

NONE / CUSTOM

dv_source

Best practice is to have the source name as a metadata attribute on staging entities and to map it to Data Vault entities without any transformations.

Alternatively, a CUSTOM transformation can be used e.g. with a source name or an environment variable given in single quotes. However, it is best to avoid the CUSTOM transformation type if possible.

store_id

HASH

dv_driving_key

Driving key of the link table

-

DV_STATUS

status

Status of the record

By default, load option OPT_DELTA_EXTRACT is set as true. However, you can explicitly set it to improve the understandability of the load definition.

Store and sales person values displayed in the examples for simplicity, but those fields were not actually loaded to the status satellite.

The status satellite load on 2023-05-25 would result in the following row for store 444:

dv_id

status

dv_load_time

store

sales person

65ea6441a1ffe53f0565417a97568310

1

2023-05-25 12:32:56.152

444

275

Loading the status satellite again on 2023-05-28 where the sales person for store 444 has changed to 287 would add two new rows:

dv_id

status

dv_load_time

store

sales person

65ea6441a1ffe53f0565417a97568310

1

2023-05-25 12:32:56.152

444

275

65ea6441a1ffe53f0565417a97568310

0

2023-05-28 10:36:46.580

444

275

6995667b118ede55d71d4cd7451345db

1

2023-05-28 10:36:47.170

444

287

When querying the data, we can utilize the automatically created current view for this status satellite. The query for finding out who is the current sales person for store 444:

SQL
select 
    lss.dv_id,
    hst.business_key as store_key,
    hs.business_key as sales_person_key,
    lssc.status,
    lssc.dv_load_time
from rdv.l_store_salesperson lss
join rdv.h_salesperson hs on lss.dv_id_sales_person = hs.dv_id
join rdv.h_store hst on lss.dv_id_store = hst.dv_id
join rdv.ss_store_salesperson_c lssc
    on lss.dv_id = lssc.dv_id
where hst.business_key = 444;

As a result, the query will return the most recent data about the store and its responsible sales person:

dv_id

store_key

sales_person_key

status

dv_load_time

6995667b118ede55d71d4cd7451345db

444

287

1

2023-05-28 10:36:47.170

In this case, we needed to track the validity of the relationship between store_id and sales_person_id. This was achieved with creating a link table and associating a status satellite to it. The driving key for the status satellite was store_id.

Status satellite load with scanning latest rows from staging by driving key

If staged source data contains multiple rows over dv_driving_key, then a regular status satellite load will result in duplicates over dv_driving_key and dv_load_time (compare to this satellite load example). If there are duplicates due to there being multiple source data batches staged at the same time, see this satellite Run ID load example. If you are not using Run ID Logic, if there are duplicates even within one batch of source data, or if for performance reasons you only wish to persist the latest status per load execution, you can follow this example to pick the latest row over dv_driving_key.

Notes:

  • OPT_SCAN_ONLY_THE_LATEST_BY_KEY cannot be used with status satellites as it scans the incoming data by dv_id. Here we need to use dv_driving_key instead.

  • The load pattern in this example discards all but the latest version over dv_driving_key per load execution.

  • This pattern is performant also for initial loads where only the latest status is initially required.

  • This pattern requires QUALIFY clauses to be supported by the target database, syntax may vary.

Configure a status satellite load as usual, add the following OPT_WHERE load option:

SQL
1=1
 
QUALIFY ROW_NUMBER()
  OVER (
    PARTITION BY <target_entity_transformation_by_dv_driving_key>
    ORDER BY <source_entity_attribute_by_{attribute_type}> DESC
  ) = 1

Where:

  • 1=1 returns true for the WHERE clause.

  • target_entity_transformation_by_dv_driving_key variable is translated into the dv_driving_key transformation mapped in the entity mapping.

  • source_entity_attribute_by_{attribute_type} variable returns a reference to the staging entity attribute that is used for ordering the incoming data into the correct order. Note that you will need to define a technical type for the staging attribute in the staging entity and update it to the variable reference here, e.g. source_entity_attribute_by_stg_load_time.

Example of the pattern in a generated load:

SQL
...
SELECT
  ... 
FROM
  ...
WHERE
  1=1
QUALIFY ROW_NUMBER()
  OVER (
    PARTITION_BY MD5(UPPER(NVL(NULLIF(TRIM(CAST(salesorderdetailid AS VARCHAR)), ''), '-1')))
    ORDER BY stg_load_time DESC
  ) = 1
...
JavaScript errors detected

Please note, these errors can depend on your browser setup.

If this problem persists, please contact our support.