Entity mapping types
Entity mapping type and position define the role of an entity mapping in a load. Entity mapping types can be used to:
Enable automatic load generation in loads with multiple entity mappings when one mapping is set as a source and other mappings set as dependencies.
Define which source entity mapping drives Run ID Logic in loads with multiple entity mappings.
See also:
Usage
Entity mapping type and position are selected per entity mapping in the Entity Mappings section of the Load view. The following entity mapping types are available:
Entity mapping type | Description |
---|---|
SOURCE | Default entity mapping type. Mapped source entity is considered as a source for the load. Automatic load generation is available when a load has one entity mapping of type SOURCE or SOURCE (Driving Run ID Logic). When using OVERRIDE load steps, there can be multiple entity mappings of type SOURCE. |
SOURCE (Driving Run ID Logic) | Available when load option OPT_USE_RUN_IDS = true. In loads with multiple SOURCE type entity mappings, use the entity mapping POSITION to determine which SOURCE type entity mapping is driving Run ID Logic. The entity mapping with the lowest position number will be used. Only one entity mapping can drive Run ID Logic per load. |
DEPENDENCY | Used to indicate that a load is dependent on an entity. Mapped dependencies are used to determine load order when generating load workflows. Automatic load generation is available in loads with multiple entity mappings of type DEPENDENCY and one entity mapping of type SOURCE or SOURCE (Driving Run ID Logic). |
Examples
Using Run ID Logic in a load with multiple entity mappings
In this example we are joining two data vault entities to incrementally load a flat entity using Run ID Logic.
Our use case consists of position data of marine vessels which we are receiving over time along with descriptive data about the vessels. We would like to join these two to form a flat entity of vessel positions over time with some attributes describing the vessels. We are using Run ID Logic to load new batches of the position data incrementally.
The load is defined as follows:
Target entity:
publish.FL_MARINE_AIS_POSITIONLoad option:
OPT_USE_RUN_IDS = trueEntity mappings:
Source entity | Position | Type | Description |
---|---|---|---|
rdv.L_T_MARINE_AIS_POSITION | 1 | SOURCE (Driving Run ID Logic) | Data vault transactional link table containing marine vessel position data over time. This mapping is used to drive Run ID Logic. |
rdv.S_MARINE_VESSEL_MMSI_C | 2 | SOURCE | Data vault satellite current view containing descriptive vessel identity data. |
OVERRIDE load step:
INSERT INTO <target_schema>.<target_entity_name> (
...
)
SELECT
l.mmsi AS vessel_mmsi
,s.callsign AS vessel_callsign
,s.name AS vessel_name
,s.shiplength AS vessel_length
,s.shipwidth AS vessel_width
,s.shiptype AS vessel_type
,l.latitude AS latitude
,l.longitude AS longitude
,l.speedoverground AS speed_over_ground_knots
,l.courseoverground AS course_over_ground_degree
,l.rateofturn AS rate_of_turn_degree
,l.trueheading AS heading_degree
,l.msgtime AS message_timestamp
,date_part(epoch_millisecond, l.msgtime) AS message_epoch_ms
,l.dv_load_time AS meta_load_time
,l.dv_source_system AS meta_source_system
,<targetrunid> AS meta_run_id
FROM
rdv.L_T_MARINE_AIS_POSITION l
LEFT JOIN
rdv.S_MARINE_VESSEL_MMSI_C s
ON l.dv_id_marine_vessel_mmsi = s.dv_id
WHERE
l.dv_run_id IN (<loadablerunids>);
Agile Data Engine is tracking loaded run ids per source to target entity pair. In this load we have selected the position data to drive Run ID Logic, therefore loaded run ids are tracked between rdv.L_T_MARINE_AIS_POSITION and publish.FL_MARINE_AIS_POSITION.
Note the use of target_schema, target_entity_name, targetrunid and loadablerunids variables.
Load executed by Agile Data Engine:
INSERT INTO publish.FL_MARINE_AIS_POSITION (
...
)
SELECT
l.mmsi AS vessel_mmsi
,s.callsign AS vessel_callsign
,s.name AS vessel_name
,s.shiplength AS vessel_length
,s.shipwidth AS vessel_width
,s.shiptype AS vessel_type
,l.latitude AS latitude
,l.longitude AS longitude
,l.speedoverground AS speed_over_ground_knots
,l.courseoverground AS course_over_ground_degree
,l.rateofturn AS rate_of_turn_degree
,l.trueheading AS heading_degree
,l.msgtime AS message_timestamp
,date_part(epoch_millisecond, l.msgtime) AS message_epoch_ms
,l.dv_load_time AS meta_load_time
,l.dv_source_system AS meta_source_system
,3 AS meta_run_id
FROM
rdv.L_T_MARINE_AIS_POSITION_BARENTSWATCH l
LEFT JOIN
rdv.S_MARINE_VESSEL_MMSI_BARENTSWATCH_C s
ON l.dv_id_marine_vessel_mmsi = s.dv_id
WHERE
l.dv_run_id IN (314);
Using a DEPENDENCY mapping in an automatically generated load to adjust load order
In this example we have a load that depends on multiple entities and we still want to use automatic load generation instead of OVERRIDE load steps. This is possible with the DEPENDENCY entity mapping type.
Our target entity contains contact information and when loading it we need to use an exclusion list of contacts that should not be loaded to the target entity. We need to ensure that the exclusion list is always loaded with new data before the target entity so that excluded contact ids would not be stored.
Target entity:
edw.DW_CONTACTSLoad options:
- SQL
contact_id NOT IN (SELECT excluded_contact_id FROM edw.R_EXCLUDED_CONTACTS)
OPT_HISTORIZED_BY_DATAHASH_AS_INSERT = true
(Note that this option is used just as an example, you are free to use any load logic.)
Entity mappings:
Source entity | Position | Type | Description |
---|---|---|---|
staging.STG_CONTACT | 1 | SOURCE | Staging table containing contact information. |
edw.R_EXCLUDED_CONTACTS | 2 | DEPENDENCY | Reference table containing contact ids that should not be loaded to the data warehouse. |
Due to the DEPENDENCY mapping, in the generated workflow R_EXCLUDED_CONTACTS load is placed before DW_CONTACTS load:
DW_CONTACTS load executed by Agile Data Engine:
INSERT INTO edw.DW_CONTACTS (
...
)
SELECT DISTINCT
...
FROM (
SELECT
...
FROM
staging.STG_CONTACT src_entity
WHERE
contact_id NOT IN (SELECT excluded_contact_id FROM edw.R_EXCLUDED_CONTACTS)
) src
WHERE
NOT EXISTS (
...
)
;