Skip to main content
Skip table of contents

SCD Type 2 with FTL Templates

A Slowly Changing Dimension Type 2 (SCD2) is a common data warehousing technique used to track historical changes in dimension data over time. Instead of overwriting existing records when an attribute changes, SCD2 adds a new row with the updated values while retaining the old record:

vessel_id

name

type

flag_state

valid_from

valid_to

validity_status

123

Sea Explorer

Cargo

Liberia

2023-01-01 00:00:00

2023-06-15 00:00:00

0

123

Sea Explorer

Cargo

Panama

2023-06-15 00:00:00

2024-04-01 00:00:00

0

123

Sea Explorer XL

Cargo

Panama

2024-04-01 00:00:00

9999-12-31 00:00:00

1

This approach enables full historical traceability and supports time-based analysis.

Each record typically includes technical metadata such as:

  • valid_from: the timestamp when the version became active

  • valid_to: the timestamp when the version was closed (or a placeholder for active rows)

  • validity_status: a flag indicating if the record is the current version

This example implements SCD2 logic using an FTL template.

FTL template

Copy and save the FTL template to CONFIG_LOAD_TEMPLATES:

SCD2 FTL template
CODE
<#--
  SCD Type 2 Load Template (Using Preprocessed Temporary Table)

  Description:
  This template implements a Slowly Changing Dimension Type 2 (SCD2) load logic using a temporary input table
  that contains the latest versions of source records. The input temp table must be prepared in a prior step
  using OVERRIDE SQL defined by the user.

  Expected Input:
    - Parameter: temp_table_prefix (REQUIRED) - Temp table name is expected to be same as the target entity name
      but with an additional prefix. Give the prefix as parameter when calling the template.
    - The temp table must contain all business attributes defined in the target SCD2 table,
      excluding technical attributes with the following types:
        - SCD_ID
        - DV_HASHKEY
        - VALID_TO
        - VALIDITY_STATUS
      These technical columns are automatically generated by this template.

    - The attribute for type VALID_FROM must be included in the temp table.
      It may be derived using either CURRENT_TIMESTAMP or an appropriate source attribute.

  Behavior:
    - Filters out records from the input temp table that already exist in the target table with matching business keys and data hash.
    - Generates new SCD2 rows for only the changed data.
    - Adds technical metadata such as SCD_ID, DATAHASH, VALID_FROM, VALID_TO, and VALIDITY_STATUS.
    - Performs a MERGE to:
        - Update existing rows in the target to close validity windows.
        - Insert new rows with open-ended validity.

    Use Case:
    - Suitable for publish dimension loads or other SCD2 targets.
    - Helps simplify SCD2 loads by automating the validity window & merge logic. The user only needs to write 
      the logic for getting the latest data from source.
-->

<#assign target = model.load.target>
<#assign targetScdId = loads.getAttributeForType(target.attributes, "SCD_ID")>
<#assign targetId = loads.getAttributeForType(target.attributes, "DV_HASHKEY")>
<#assign targetHash = loads.getAttributeForType(target.attributes, "DV_DATAHASH")>
<#assign targetValidFrom = loads.getAttributeForType(target.attributes, "VALID_FROM")>
<#assign targetValidTo = loads.getAttributeForType(target.attributes, "VALID_TO")>
<#assign targetValidityStatus = loads.getAttributeForType(target.attributes, "VALIDITY_STATUS")>
<#assign scdIdAttributes = target.attributes?filter(attr -> [
    "DV_HASHKEY"
    ,"VALID_FROM"
]?seq_contains(attr.attributeType!))>
<#assign targetAttributesFiltered = target.attributes?filter(attr -> ![
    "SCD_ID"
    ,"DV_DATAHASH"
    ,"VALID_FROM"
    ,"VALID_TO"
    ,"VALIDITY_STATUS"
]?seq_contains(attr.attributeType!))>


MERGE INTO <target_schema>.<target_entity_name> trg USING (
    WITH src_tmp AS (
        -- Changed data from source temp table
        SELECT
            *
        FROM (
            SELECT
                <@sql.generateEntityAttributeList attributes = targetAttributesFiltered indentation = 4/>
                , ${loads.getTransformationForAttributes("REFERENCE_HASH_KEY", scdIdAttributes)} AS ${targetScdId.name}
                , ${loads.getTransformationForAttributes("REFERENCE_HASH_DIFF", targetAttributesFiltered)} AS ${targetHash.name}
                , ${targetValidFrom.name}
                , CAST(null AS ${targetValidTo.physicalDatatype}) AS ${targetValidTo.name}
                , CAST(null AS ${targetValidityStatus.physicalDatatype}) AS ${targetValidityStatus.name}
            FROM
                <target_schema>.${temp_table_prefix}<target_entity_name>
        ) src
        WHERE
            NOT EXISTS (
                SELECT
                    1
                FROM
                    <target_schema>.<target_entity_name> trg
                WHERE
                    trg.validity_status = 1
                    AND trg.${targetId.name} = src.${targetId.name}
                    AND trg.${targetHash.name} = src.${targetHash.name}
            )
    )
    -- Rows to be merged
    SELECT 
        <@sql.generateEntityAttributeList
            attributes = target.attributes?filter(attr -> !["VALID_TO", "VALIDITY_STATUS"]?seq_contains(attr.attributeType!))
            indentation = 2
        />
        -- Set valid_to as the valid_from of the next version (by ID), or '9999-12-31' if no next version exists
        , IFNULL(
            LEAD(${targetValidFrom.name}) OVER (PARTITION BY ${targetId.name} ORDER BY ${targetValidFrom.name})
            , CAST('9999-12-31' AS ${targetValidTo.physicalDatatype})
        ) AS valid_to
        -- Only the latest version (by ID) is valid
        , CASE
            WHEN LEAD(${targetValidFrom.name}) OVER (PARTITION BY ${targetId.name} ORDER BY ${targetValidFrom.name}) IS NULL 
            THEN 1
            ELSE 0
        END AS validity_status
    
    FROM (
        -- Select rows from the target entity that require validity updates
        SELECT
            <@sql.generateEntityAttributeList attributes = target.attributes indentation = 3/>
        FROM
            <target_schema>.<target_entity_name> trg
        WHERE
            trg.validity_status = 1
            AND EXISTS (
                SELECT
                    1
                FROM
                    src_tmp
                WHERE
                    trg.${targetId.name} = src_tmp.${targetId.name}
            )
        
        -- Union all changed data from source
        UNION ALL
        SELECT
            <@sql.generateEntityAttributeList attributes = target.attributes indentation = 3/>
        FROM
            src_tmp
    )
) src
ON (
    trg.${targetId.name} = src.${targetId.name}
    AND trg.${targetHash.name} = src.${targetHash.name}
    AND trg.${targetValidFrom.name} = src.${targetValidFrom.name}
)
WHEN MATCHED THEN UPDATE SET
    trg.${targetValidTo.name} = src.${targetValidTo.name}
    , trg.${targetValidityStatus.name} = src.${targetValidityStatus.name}
WHEN NOT MATCHED THEN INSERT (
    <@sql.generateEntityAttributeList attributes = target.attributes indentation = 1/>
)
VALUES (
    <@sql.generateEntityAttributeList attributes = target.attributes indentation = 1/>
)

Configure parameter temp_table_prefix (STRING) for the template. Go to the Package Details tab and commit your changes.

This template has been tested in Google BigQuery, Databricks SQL and Snowflake. Modify it for other target databases as needed.

Target table requirements

  • Attributes with technical types:

    • SCD_ID

    • DV_HASHKEY

    • DV_DATAHASH

    • VALID_FROM

    • VALID_TO

    • VALIDITY_STATUS

  • Any number of business attributes

Technical attribute types can be defined in CONFIG_ENTITY_DEFAULTS. It is also recommended to create a custom entity type for SCD2, see example below.

Input requirements

  • The template expects a temporary table with the same name (incl. schema) as the target entity but with an additional prefix. Give this prefix as a parameter when calling the template.

  • The temporary table should contain the latest version of records from the source(s). The assumption is that this temp table is populated in a previous load step, using custom SQL logic provided by the user. The FTL template then handles the merge to the target SCD2 table.

  • The temporary table must include:

    • An id attribute. The attribute name must match the name defined in the target entity for the DV_HASHKEY technical type.

    • A valid_from timestamp. Use either a source-derived or a system-generated timestamp. The attribute name must match the name defined in the target entity for the VALID_FROM technical type.

    • All business attributes found in the target entity with the same names.

  • The temporary table must exclude attributes for these technical types:

    • SCD_ID

    • DV_DATAHASH

    • VALID_TO

    • VALIDITY_STATUS
      These excluded attributes are automatically generated by the template.

What the template does

  • Filters out unchanged records from the incoming data (based on DV_HASHKEY, DV_DATAHASH and target VALIDITY_STATUS).

  • Adds new SCD2 rows only for changed records.

  • Automatically sets:

    • SCD_ID (compound hash key from DV_HASHKEY and VALID_FROM)

    • DV_DATAHASH (hash diff for change detection)

    • VALID_FROM (as set in the temporary table)

    • VALID_TO (as the next version’s VALID_FROM, or open-ended 9999-12-31)

    • VALIDITY_STATUS (1 for current version, 0 otherwise)

  • Uses a MERGE to:

    • Update existing records to close out their validity windows

    • Insert new records for the changed data

Example

Configuring a new entity type

You can create a new entity type for SCD2 with default attributes in CONFIG_ENTITY_DEFAULTS:

  1. Configure attribute types in Entity Defaults - Attribute Types:

    • SCD_ID

    • VALID_FROM

    • VALID_TO

    • VALIDITY_STATUS

  2. Configure default attributes in Entity Defaults - Default Attributes:

Attribute name template

Data type

Attribute type

<entity_logical_name>_scd_id

CHAR(32)

SCD_ID

<entity_logical_name>_id

CHAR(32)

DV_HASHKEY

datahash

CHAR(32)

DV_DATAHASH

valid_from

TIMESTAMP

VALID_FROM

valid_to

TIMESTAMP

VALID_TO

validity_status

INTEGER1

VALIDITY_STATUS

Note that you are free to choose the naming convention. However, the attribute types are directly referenced in the FTL template so modifying those requires also changes to the template definition.

  1. Configure entity type DIM_SCD2 in Entity Defaults - Entity Types with:

    • Default physical type: TABLE

    • Naming & default schema as desired

    • Click the entity type and add default attributes in Entity Default Attributes:

      • <entity_logical_name>_scd_id

      • <entity_logical_name>_id

      • datahash

      • valid_from

      • valid_to

      • validity_status

    • In Default Load Types add default load type:

      • Entity physical type: TABLE

      • Load type: TRANSFORM_PERSIST

  2. Finally, go to the Package Details tab and commit your changes.

Creating an SCD2 entity

In this example, we use the previously configured scd2 FTL template together with the DIM_SCD2 entity type to create an SCD2 entity. The use case focuses on marine data, where we load a vessel dimension that tracks details about marine vessels over time.

  1. A new entity with type DIM_SCD2 and name D_VESSEL_SCD2 is created.

  2. Default attributes are automatically added:

    • vessel_scd_id

    • vessel_id

    • datahash

    • valid_from

    • valid_to

    • validity_status

  3. A load is created with the following steps:

#

Load step name

Logic

1

drop_temp_start

SQL
DROP TABLE IF EXISTS <target_schema>.Z_TMP_<target_entity_name>;

2

create_temp_table

SQL
CREATE TEMPORARY TABLE <target_schema>.Z_TMP_<target_entity_name> AS
SELECT ...

This step loads the latest version of records from the source(s) and stores them in a temporary table.

Note that the CREATE TEMPORARY TABLE syntax is not supported in all databases, use CREATE TABLE instead.

3

merge_scd2_with_ftl

CODE
<ftltemplate:scd2:temp_table_prefix=Z_TMP_>

4

drop_temp_end

SQL
DROP TABLE IF EXISTS <target_schema>.Z_TMP_<target_entity_name>;

In this example, we are loading the temporary table from data vault entities:

create_temp_table
SQL
CREATE TEMPORARY TABLE <target_schema>.Z_TMP_<target_entity_name> AS
WITH voyages_digi AS (
    SELECT DISTINCT 
        dv_id_marine_vessel_mmsi
        ,dv_id_marine_vessel_imo
        ,dv_id_marine_vessel_type
    FROM rdv.l_t_marine_ais_voyage_digitraffic
)
,voyages_barw AS (
    SELECT DISTINCT 
        dv_id_marine_vessel_mmsi
        ,dv_id_marine_vessel_imo
        ,dv_id_marine_vessel_type
    FROM rdv.l_t_marine_ais_voyage_barentswatch
)
,vessel AS (
    SELECT 
        h_vessel.dv_id AS vessel_id
        ,h_vessel.business_key AS mmsi_key
        ,COALESCE(s_vessel_digi.name, s_vessel_barw.name) AS vessel_name
        ,COALESCE(s_vessel_type_a.type_hierarchy, s_vessel_type_b.type_hierarchy) AS vessel_type_code
        ,COALESCE(s_vessel_type_a.hierarchy_description, s_vessel_type_b.hierarchy_description) AS vessel_type
        ,r_country.country_code AS country_code
        ,r_country.country AS country
        ,h_vessel.dv_source_system AS source_system
    FROM
        rdv.h_marine_vessel_mmsi h_vessel
    LEFT JOIN
        rdv.r_marine_vessel_country r_country
        ON LEFT(h_vessel.business_key, 3) = r_country.mid
    LEFT JOIN
        rdv.s_marine_vessel_mmsi_digitraffic_c s_vessel_digi
        ON h_vessel.dv_id = s_vessel_digi.dv_id
    LEFT JOIN
        rdv.s_marine_vessel_mmsi_barentswatch_c s_vessel_barw
        ON h_vessel.dv_id = s_vessel_barw.dv_id
    LEFT JOIN
        voyages_digi
        ON h_vessel.dv_id = voyages_digi.dv_id_marine_vessel_mmsi
    LEFT JOIN
        rdv.s_marine_vessel_type_reference_c s_vessel_type_a
        ON s_vessel_type_a.dv_id = voyages_digi.dv_id_marine_vessel_type
    LEFT JOIN
        voyages_barw
        ON h_vessel.dv_id = voyages_barw.dv_id_marine_vessel_mmsi
    LEFT JOIN
        rdv.s_marine_vessel_type_reference_c s_vessel_type_b
        ON s_vessel_type_b.dv_id = voyages_barw.dv_id_marine_vessel_type
)
SELECT 
    *
    ,CURRENT_TIMESTAMP AS valid_from
FROM vessel;

Note that we are using CURRENT_TIMESTAMP as the VALID_FROM timestamp. Alternatively, we could use a timestamp from the source data if we can trust it.

Generated load:

Generated SQL
SQL
/* 1. drop_temp_start (OVERRIDE - SQL) */
DROP TABLE IF EXISTS publish.Z_TMP_D_VESSEL_SCD2;

/* 2. create_temp_table (OVERRIDE - SQL) */
CREATE TEMPORARY TABLE publish.Z_TMP_D_VESSEL_SCD2 AS
WITH voyages_digi AS (
    SELECT DISTINCT 
        dv_id_marine_vessel_mmsi
        ,dv_id_marine_vessel_imo
        ,dv_id_marine_vessel_type
    FROM rdv.l_t_marine_ais_voyage_digitraffic
)
,voyages_barw AS (
    SELECT DISTINCT 
        dv_id_marine_vessel_mmsi
        ,dv_id_marine_vessel_imo
        ,dv_id_marine_vessel_type
    FROM rdv.l_t_marine_ais_voyage_barentswatch
)
,vessel AS (
    SELECT 
        h_vessel.dv_id AS vessel_id
        ,h_vessel.business_key AS mmsi_key
        ,COALESCE(s_vessel_digi.name, s_vessel_barw.name) AS vessel_name
        ,COALESCE(s_vessel_type_a.type_hierarchy, s_vessel_type_b.type_hierarchy) AS vessel_type_code
        ,COALESCE(s_vessel_type_a.hierarchy_description, s_vessel_type_b.hierarchy_description) AS vessel_type
        ,r_country.country_code AS country_code
        ,r_country.country AS country
        ,h_vessel.dv_source_system AS source_system
    FROM
        rdv.h_marine_vessel_mmsi h_vessel
    LEFT JOIN
        rdv.r_marine_vessel_country r_country
        ON LEFT(h_vessel.business_key, 3) = r_country.mid
    LEFT JOIN
        rdv.s_marine_vessel_mmsi_digitraffic_c s_vessel_digi
        ON h_vessel.dv_id = s_vessel_digi.dv_id
    LEFT JOIN
        rdv.s_marine_vessel_mmsi_barentswatch_c s_vessel_barw
        ON h_vessel.dv_id = s_vessel_barw.dv_id
    LEFT JOIN
        voyages_digi
        ON h_vessel.dv_id = voyages_digi.dv_id_marine_vessel_mmsi
    LEFT JOIN
        rdv.s_marine_vessel_type_reference_c s_vessel_type_a
        ON s_vessel_type_a.dv_id = voyages_digi.dv_id_marine_vessel_type
    LEFT JOIN
        voyages_barw
        ON h_vessel.dv_id = voyages_barw.dv_id_marine_vessel_mmsi
    LEFT JOIN
        rdv.s_marine_vessel_type_reference_c s_vessel_type_b
        ON s_vessel_type_b.dv_id = voyages_barw.dv_id_marine_vessel_type
)
SELECT 
    *
    ,CURRENT_TIMESTAMP AS valid_from
FROM vessel;

/* 3. merge_scd2_with_ftl (OVERRIDE - SQL) */
MERGE INTO publish.D_VESSEL_SCD2 trg USING (
    WITH src_tmp AS (
        -- Changed data from source temp table
        SELECT
            *
        FROM (
            SELECT
                vessel_id
                , mmsi_key
                , vessel_name
                , vessel_type_code
                , vessel_type
                , country_code
                , country
                , source_system
                , MD5(UPPER(NVL(NULLIF(TRIM(vessel_id), ''), '-1') || '~' || NVL(TO_VARCHAR(valid_from, 'YYYY-MM-DDTHH24:MI:SS.FF6'), '-1'))) AS vessel_scd_id
                , MD5(NVL(vessel_id, '-1') || '~' || NVL(CAST(mmsi_key AS VARCHAR), '-1') || '~' || NVL(vessel_name, '-1') || '~' || NVL(CAST(vessel_type_code AS VARCHAR), '-1') || '~' || NVL(vessel_type, '-1') || '~' || NVL(country_code, '-1') || '~' || NVL(country, '-1') || '~' || NVL(source_system, '-1')) AS datahash
                , valid_from
                , CAST(null AS TIMESTAMP) AS valid_to
                , CAST(null AS TINYINT) AS validity_status
            FROM
                publish.Z_TMP_D_VESSEL_SCD2
        ) src
        WHERE
            NOT EXISTS (
                SELECT
                    1
                FROM
                    publish.D_VESSEL_SCD2 trg
                WHERE
                    trg.validity_status = 1
                    AND trg.vessel_id = src.vessel_id
                    AND trg.datahash = src.datahash
            )
    )
    -- Rows to be merged
    SELECT 
        vessel_scd_id
        , vessel_id
        , datahash
        , valid_from
        , mmsi_key
        , vessel_name
        , vessel_type_code
        , vessel_type
        , country_code
        , country
        , source_system
        -- Set valid_to as the valid_from of the next version (by ID), or '9999-12-31' if no next version exists
        , IFNULL(
            LEAD(valid_from) OVER (PARTITION BY vessel_id ORDER BY valid_from)
            , CAST('9999-12-31' AS TIMESTAMP)
        ) AS valid_to
        -- Only the latest version (by ID) is valid
        , CASE
            WHEN LEAD(valid_from) OVER (PARTITION BY vessel_id ORDER BY valid_from) IS NULL 
            THEN 1
            ELSE 0
        END AS validity_status
	
    FROM (
        -- Select rows from the target entity that require validity updates
        SELECT
            vessel_scd_id
            , vessel_id
            , datahash
            , valid_from
            , valid_to
            , validity_status
            , mmsi_key
            , vessel_name
            , vessel_type_code
            , vessel_type
            , country_code
            , country
            , source_system
        FROM
            publish.D_VESSEL_SCD2 trg
        WHERE
            trg.validity_status = 1
            AND EXISTS (
                SELECT
                    1
                FROM
                    src_tmp
                WHERE
                    trg.vessel_id = src_tmp.vessel_id
            )
        
        -- Union all changed data from source
        UNION ALL
        SELECT
            vessel_scd_id
            , vessel_id
            , datahash
            , valid_from
            , valid_to
            , validity_status
            , mmsi_key
            , vessel_name
            , vessel_type_code
            , vessel_type
            , country_code
            , country
            , source_system
        FROM
            src_tmp
    )
) src
ON (
    trg.vessel_id = src.vessel_id
    AND trg.datahash = src.datahash
    AND trg.valid_from = src.valid_from
)
WHEN MATCHED THEN UPDATE SET
    trg.valid_to = src.valid_to
    , trg.validity_status = src.validity_status
WHEN NOT MATCHED THEN INSERT (
    vessel_scd_id
    , vessel_id
    , datahash
    , valid_from
    , valid_to
    , validity_status
    , mmsi_key
    , vessel_name
    , vessel_type_code
    , vessel_type
    , country_code
    , country
    , source_system
)
VALUES (
    vessel_scd_id
    , vessel_id
    , datahash
    , valid_from
    , valid_to
    , validity_status
    , mmsi_key
    , vessel_name
    , vessel_type_code
    , vessel_type
    , country_code
    , country
    , source_system
);

/* 4. drop_temp_end (OVERRIDE - SQL) */
DROP TABLE IF EXISTS publish.Z_TMP_D_VESSEL_SCD2;

JavaScript errors detected

Please note, these errors can depend on your browser setup.

If this problem persists, please contact our support.