Importing Initial Metadata

When you start working with Agile Data Engine, you need initial metadata about entities. How to obtain it depends on the project and the use case. This guide presents a few example cases.

Cases for importing metadata

The following diagram visualizes a common pattern for loading data from source systems to a data warehouse.

Depending on the use case and the project, the metadata for ADE entities can be fetched from any of the following stages of this pattern:

From source systems, if the source system is a relational database
1. A metadata query can be executed manually against the source system, or
2. Metadata queries can be added to the integration tool
From file objects in cloud storage
From the target database itself, if data ingestion is done outside ADE

1. From source system

In this case, there is a source database from which data is extracted to cloud storage with an integration tool. To get the schemas right in the target database with Agile Data Engine, the most practical approach is to query the information schema in the source database directly.

To get metadata from the source database, you need to:

Run an SQL query against the source database
Export the results as a CSV file in the specified CSV format
Import the entities from the CSV file into ADE
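
The first two steps can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the official procedure: the query assumes that the source database exposes the standard `information_schema.columns` view, the schema name `sales` is a placeholder, and the CSV column names other than `entity_schema` are illustrative. The real header must follow the CSV format specified for entity import. `columns_to_csv` and `export_metadata` are hypothetical helper names.

```python
import csv
import io

# Assumption: the source database exposes the standard information_schema.
# Adjust the query and the schema filter ('sales' is a placeholder).
METADATA_QUERY = """
SELECT table_schema, table_name, column_name, data_type, ordinal_position
FROM information_schema.columns
WHERE table_schema = 'sales'
ORDER BY table_name, ordinal_position
"""

# Placeholder header: replace it with the columns required by the
# CSV format specified for ADE entity import.
CSV_HEADER = ["entity_schema", "entity_name", "attribute_name", "datatype", "position"]


def columns_to_csv(rows, out):
    """Write (schema, table, column, type, position) rows as CSV to `out`."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(CSV_HEADER)
    for row in rows:
        writer.writerow(row)


def export_metadata(connection, out):
    """Run METADATA_QUERY on a DB-API 2.0 connection and export the result."""
    cursor = connection.cursor()
    cursor.execute(METADATA_QUERY)
    columns_to_csv(cursor.fetchall(), out)


# Hard-coded rows stand in for a real query result in this example.
sample_rows = [
    ("sales", "customer", "customer_id", "integer", 1),
    ("sales", "customer", "name", "varchar", 2),
]
buffer = io.StringIO()
columns_to_csv(sample_rows, buffer)
csv_text = buffer.getvalue()
```

With a real source system, `export_metadata` would be called with a connection object from the database driver in use and an open file handle instead of the in-memory buffer.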

Agile Data Engine is never directly connected to source systems, so the metadata import must be done with CSV files.

In some cases, it can make sense to add the entity import queries to the data integration tool.

Examples

See the example queries on GitHub.

See an example video of the process:

https://youtu.be/lcgqXflBcQ8

2. From source files

In this case, there is no direct access to the source systems, and the schemas need to be derived from the files in cloud storage or from locally downloaded files.

To get metadata from source files, you need to:

Profile the source file with a tool or script
Export the results as a CSV file in the specified CSV format
Import the entities from the CSV file into ADE
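
Profiling can be as simple as reading a sample of the file and guessing a type for each column. The following is a minimal sketch that uses only the Python standard library and assumes a comma-delimited text file with a header row. The type names (`INTEGER`, `DECIMAL`, `VARCHAR`) and the helpers `guess_type` and `profile_csv` are illustrative, not an ADE convention.

```python
import csv
import io


def guess_type(values):
    """Return a coarse type name that fits every non-empty value."""
    non_empty = [v for v in values if v != ""]
    if not non_empty:
        return "VARCHAR"
    for type_name, parse in (("INTEGER", int), ("DECIMAL", float)):
        try:
            for value in non_empty:
                parse(value)
            return type_name
        except ValueError:
            continue  # at least one value does not fit; try the next type
    return "VARCHAR"


def profile_csv(handle, sample_size=1000):
    """Return [(column_name, guessed_type), ...] from the first rows of a file."""
    reader = csv.DictReader(handle)
    sample = []
    for index, row in enumerate(reader):
        if index >= sample_size:
            break
        sample.append(row)
    return [
        # Missing trailing fields are read as None; treat them as empty.
        (name, guess_type([(row[name] or "") for row in sample]))
        for name in (reader.fieldnames or [])
    ]


# An in-memory file stands in for a downloaded source file.
source = io.StringIO("id,amount,city\n1,9.50,Helsinki\n2,12,Tampere\n")
profile = profile_csv(source)
```

The guessed types are only a starting point: a sample cannot prove that later rows fit the same type, so the result should be reviewed before it is exported to the CSV format and imported.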

Examples

See the example scripts on GitHub.

3. From existing data in target database

In this case, data has already been ingested into the target database, and Agile Data Engine is used for data development and transformations. To populate ADE with the metadata definitions of the existing schemas, you can query the information schema in the target database.

To get metadata from the target database, you need to:

Run an SQL query against the target database
Export the results as a CSV file in the specified CSV format
Import the entities from the CSV file into ADE

Examples

See the example queries on GitHub.

Recommended metadata format

With entity import, it is possible to import any entity metadata, including data warehouse objects or dimensional model objects. However, as a starting point, it is useful to import source entities only.

Recommended values for initial metadata imports, as defined in entity import:

entity_physical_type = METADATA_ONLY
entity_type = SOURCE
entity_dv_source = '<Set source system as in CONFIG_SYSTEMS>'
entity_schema = '<Either src or same as in source system>'
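
The recommended values can be applied to every row before the CSV file is written. The following is a minimal sketch that assumes the rows are handled as Python dictionaries. `apply_source_defaults` is a hypothetical helper, `entity_name` is an illustrative field name, and the source system name `SALESDB` is a placeholder that must match an entry in CONFIG_SYSTEMS.

```python
def apply_source_defaults(row, source_system, schema="src"):
    """Return a copy of `row` with the recommended source-entity values set."""
    return {
        **row,
        "entity_physical_type": "METADATA_ONLY",
        "entity_type": "SOURCE",
        "entity_dv_source": source_system,  # must exist in CONFIG_SYSTEMS
        "entity_schema": schema,  # 'src' or the schema name in the source system
    }


# 'SALESDB' and 'customer' are placeholder values for illustration.
entity = apply_source_defaults({"entity_name": "customer"}, source_system="SALESDB")
```

Returning a new dictionary instead of modifying the input keeps the profiled or queried metadata unchanged, which makes it easier to rerun the step with a different schema or source system.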