Configuring Target Databases
Guide objective
This guide helps you set up a target database for Agile Data Engine.
Agile Data Engine supports the following database products as a target database:

Amazon Redshift
Azure SQL Database
Azure Synapse SQL (dedicated SQL pool, formerly Azure SQL DW)
Databricks SQL
Google BigQuery
Microsoft Fabric
Snowflake

Please refer to the database product specific instructions below to set up a target database for an Agile Data Engine Runtime environment.
Amazon Redshift
Please contact our Professional Services via the support portal for examples.
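As a rough, hedged sketch of the kind of objects typically set up for a Redshift target (the user name, database name, and password are placeholders; confirm the details with Professional Services):

```sql
-- Create a dedicated user for the ADE Runtime
CREATE USER ade_user PASSWORD '***';

-- Allow the user to create schemas and temporary tables in the target database
GRANT CREATE, TEMP ON DATABASE clouddw_dev TO ade_user;
```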
Azure SQL Database
Set up a separate Azure SQL Database for each Agile Data Engine Runtime. Within the databases, you need to set up the following objects:
LOGIN (master database)
MASTER KEY
ROLE
USER
EXTERNAL DATA SOURCE (optional)
Repeat the steps below in each environment. Feel free to name objects according to your naming conventions.
Execute step 1 in the SQL server master database.
Execute steps 2-7 in the target database.
Create a login:
CREATE LOGIN ADE_LOGIN WITH PASSWORD = '***';
Generate strong, environment-specific passwords and store them securely.
Create a master key:
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '***';
Refer to the Microsoft documentation for more information about the master key.
Create a role:
CREATE ROLE ADE_ROLE;
Grant permissions to the role:
GRANT CONTROL ON DATABASE::[CLOUDDW-DEV] TO ADE_ROLE;
GRANT ALTER ANY SCHEMA TO ADE_ROLE;
GRANT CREATE TABLE TO ADE_ROLE;
GRANT CREATE VIEW TO ADE_ROLE;
Replace CLOUDDW-DEV with the target database name.
Create a user for the login:
CREATE USER [ADE_USER] FOR LOGIN [ADE_LOGIN] WITH DEFAULT_SCHEMA = dbo;
Assign the role to the user:
EXEC sp_addrolemember 'ADE_ROLE', 'ADE_USER';
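To verify the user and role setup, you can query the catalog views in the target database (a standard T-SQL check, not specific to Agile Data Engine):

```sql
-- List role memberships in the target database
SELECT r.name AS role_name, m.name AS member_name
FROM sys.database_role_members AS rm
JOIN sys.database_principals AS r ON rm.role_principal_id = r.principal_id
JOIN sys.database_principals AS m ON rm.member_principal_id = m.principal_id
WHERE r.name = 'ADE_ROLE';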
Optionally, create an EXTERNAL DATA SOURCE by following the Microsoft documentation. For more information, please refer to Loading Source Data.
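As a hedged sketch, an external data source pointing at Azure Blob Storage might look like the following; the credential name, storage account, and container are placeholders, so check the Microsoft documentation for the options relevant to your setup:

```sql
-- Credential for accessing the storage account (the SAS token is a placeholder)
CREATE DATABASE SCOPED CREDENTIAL ADE_BLOB_CREDENTIAL
WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
SECRET = '***';

-- External data source referencing the source data container
CREATE EXTERNAL DATA SOURCE ADE_SOURCE_DATA
WITH (
    TYPE = BLOB_STORAGE,
    LOCATION = 'https://<storageaccount>.blob.core.windows.net/<container>',
    CREDENTIAL = ADE_BLOB_CREDENTIAL
);
```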
Azure Synapse SQL
Set up a separate database for each Agile Data Engine Runtime. Within the databases, you need to set up the following objects:
LOGIN (master database)
MASTER KEY
ROLE
USER
Repeat the steps below in each environment. Feel free to name objects according to your naming conventions.
Execute step 1 in the master database.
Execute steps 2-6 in the target database.
Create a login:
CREATE LOGIN ADE_LOGIN WITH PASSWORD = '***';
Generate strong, environment-specific passwords and store them securely.
Create a master key:
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '***';
Refer to the Microsoft documentation for more information about the master key.
Create a role:
CREATE ROLE ADE_ROLE;
Grant permissions to the role:
GRANT CONTROL ON DATABASE::[CLOUDDW-DEV] TO ADE_ROLE;
GRANT ALTER ANY SCHEMA TO ADE_ROLE;
GRANT CREATE TABLE TO ADE_ROLE;
GRANT CREATE VIEW TO ADE_ROLE;
Replace CLOUDDW-DEV with the target database name.
Create a user for the login:
CREATE USER [ADE_USER] FOR LOGIN [ADE_LOGIN] WITH DEFAULT_SCHEMA = dbo;
Assign the role to the user:
EXEC sp_addrolemember 'ADE_ROLE', 'ADE_USER';
Databricks SQL
Currently, Agile Data Engine is tested to work with Databricks SQL on Azure. It requires Unity Catalog to be set up, and Agile Data Engine creates all tables as Delta tables with the following properties:
'delta.columnMapping.mode' = 'name'
'delta.minReaderVersion' = '2'
'delta.minWriterVersion' = '5'
'delta.feature.allowColumnDefaults' = 'supported'
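For illustration, a manually created table carrying the same properties would look roughly like this (the catalog, schema, table, and column names are hypothetical):

```sql
CREATE TABLE my_catalog.my_schema.example_table (
    id BIGINT,
    name STRING
)
USING DELTA
TBLPROPERTIES (
    'delta.columnMapping.mode' = 'name',
    'delta.minReaderVersion' = '2',
    'delta.minWriterVersion' = '5',
    'delta.feature.allowColumnDefaults' = 'supported'
);
```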
Data loading is supported with a managed identity, and the data location needs to be defined as an external location: https://learn.microsoft.com/en-us/azure/databricks/data-governance/unity-catalog/manage-external-locations-and-credentials#manage-storage-credentials
For further examples and Terraform templates, please contact our Professional Services via the support portal.
Google BigQuery
Set up a separate Google Cloud Project for each Agile Data Engine Runtime. Within the projects, you need to set up the following:
Service account
Custom IAM role assigned to the service account
This IAM role needs access to BigQuery jobs, datasets, tables, routines, and connections
GCP data bucket read access to the service account
It is recommended to use Terraform to set up these GCP resources.
The following Terraform template uses Terraform workspaces:
# Service account
resource "google_service_account" "ade_account" {
  project      = var.project
  account_id   = "sa-${var.bigquery_app}-${terraform.workspace}"
  display_name = "Service account for ${var.bigquery_app}-${terraform.workspace}"
}

# Role grants
resource "google_project_iam_custom_role" "ade_role" {
  role_id     = "adeRole"
  title       = "ADE role"
  description = "Role for ADE usage"
  permissions = [
    "bigquery.jobs.create",
    "bigquery.jobs.get",
    "bigquery.jobs.list",
    "bigquery.jobs.listAll",
    "bigquery.jobs.delete",
    "bigquery.jobs.update",
    "bigquery.datasets.create",
    "bigquery.datasets.createTagBinding",
    "bigquery.datasets.delete",
    "bigquery.datasets.deleteTagBinding",
    "bigquery.datasets.get",
    "bigquery.datasets.getIamPolicy",
    "bigquery.datasets.link",
    "bigquery.datasets.listTagBindings",
    "bigquery.datasets.setIamPolicy",
    "bigquery.datasets.update",
    "bigquery.tables.create",
    "bigquery.tables.createIndex",
    "bigquery.tables.createSnapshot",
    "bigquery.tables.delete",
    "bigquery.tables.deleteIndex",
    "bigquery.tables.deleteSnapshot",
    "bigquery.tables.export",
    "bigquery.tables.get",
    "bigquery.tables.getData",
    "bigquery.tables.getIamPolicy",
    "bigquery.tables.list",
    "bigquery.tables.restoreSnapshot",
    "bigquery.tables.setCategory",
    "bigquery.tables.setIamPolicy",
    "bigquery.tables.update",
    "bigquery.tables.updateData",
    "bigquery.connections.get",
    "bigquery.connections.getIamPolicy",
    "bigquery.connections.list",
    "bigquery.connections.use",
    "bigquery.routines.create",
    "bigquery.routines.delete",
    "bigquery.routines.get",
    "bigquery.routines.list",
    "bigquery.routines.update",
    "bigquery.routines.updateTag"
  ]
}

resource "google_project_iam_binding" "ade_role_to_user" {
  project = var.project
  role    = "projects/${var.project}/roles/${google_project_iam_custom_role.ade_role.role_id}"
  members = [
    "serviceAccount:${google_service_account.ade_account.email}"
  ]
}

# Bucket IAM binding
# Note: google_storage_bucket.bucket refers to the data bucket defined elsewhere in your configuration
resource "google_storage_bucket_iam_binding" "databucket_binding" {
  bucket  = google_storage_bucket.bucket.name
  role    = "roles/storage.objectViewer"
  members = ["serviceAccount:${google_service_account.ade_account.email}"]
}
Microsoft Fabric
Please contact our Professional Services via the support portal for examples.
Snowflake
To connect an Agile Data Engine Runtime to Snowflake, you need to set up the following objects on your Snowflake account:
USER
ROLE
DATABASE
WAREHOUSE
STORAGE INTEGRATION (optional)
The examples below presume that the default role setup in Snowflake is used. If your Snowflake account uses a custom role setup, please adapt the examples accordingly. Also, feel free to name objects according to your naming conventions.
Create a database for each ADE Runtime environment:
USE ROLE SYSADMIN;

CREATE DATABASE CLOUDDW_DEV;
CREATE DATABASE CLOUDDW_TEST;
CREATE DATABASE CLOUDDW_PROD;
Create a warehouse for each ADE Runtime environment:
USE ROLE SYSADMIN;

CREATE WAREHOUSE WH_ADE_DEV WITH
    WAREHOUSE_SIZE = 'XSMALL'
    AUTO_SUSPEND = 60
    AUTO_RESUME = TRUE
    INITIALLY_SUSPENDED = TRUE
    MIN_CLUSTER_COUNT = 1
    MAX_CLUSTER_COUNT = 1
    SCALING_POLICY = 'STANDARD'
    COMMENT = 'Agile Data Engine DEV';

CREATE WAREHOUSE WH_ADE_TEST WITH
    WAREHOUSE_SIZE = 'XSMALL'
    AUTO_SUSPEND = 60
    AUTO_RESUME = TRUE
    INITIALLY_SUSPENDED = TRUE
    MIN_CLUSTER_COUNT = 1
    MAX_CLUSTER_COUNT = 1
    SCALING_POLICY = 'STANDARD'
    COMMENT = 'Agile Data Engine TEST';

CREATE WAREHOUSE WH_ADE_PROD WITH
    WAREHOUSE_SIZE = 'XSMALL'
    AUTO_SUSPEND = 60
    AUTO_RESUME = TRUE
    INITIALLY_SUSPENDED = TRUE
    MIN_CLUSTER_COUNT = 1
    MAX_CLUSTER_COUNT = 1
    SCALING_POLICY = 'STANDARD'
    COMMENT = 'Agile Data Engine PROD';
Customize warehouse settings as needed.
Note that it is optional to create separate warehouses for each environment; you could also use just one warehouse. However, environment-specific warehouses enable more control, easier scaling, and better traceability of costs. It is also possible to configure additional warehouses, e.g. use case specific ones, in addition to the default environment-specific warehouses; see CONFIG_LOAD_SCHEDULES and PRE load steps for more information.
Create a role for each Runtime environment:
USE ROLE USERADMIN;

CREATE ROLE ROLE_ADE_DEV COMMENT = 'Agile Data Engine DEV';
CREATE ROLE ROLE_ADE_TEST COMMENT = 'Agile Data Engine TEST';
CREATE ROLE ROLE_ADE_PROD COMMENT = 'Agile Data Engine PROD';
It is good practice to have environment-specific roles.
Grant the created roles to a higher role to ensure a continuous role hierarchy:
USE ROLE USERADMIN;

GRANT ROLE ROLE_ADE_DEV TO ROLE SYSADMIN;
GRANT ROLE ROLE_ADE_TEST TO ROLE SYSADMIN;
GRANT ROLE ROLE_ADE_PROD TO ROLE SYSADMIN;
Create a user for each Runtime environment with defaults referring to the previously created objects:
USE ROLE USERADMIN;

CREATE USER S_ADE_DEV
    PASSWORD = '***'
    COMMENT = 'Service account for Agile Data Engine DEV.'
    DEFAULT_ROLE = ROLE_ADE_DEV
    DEFAULT_WAREHOUSE = WH_ADE_DEV;

CREATE USER S_ADE_TEST
    PASSWORD = '***'
    COMMENT = 'Service account for Agile Data Engine TEST.'
    DEFAULT_ROLE = ROLE_ADE_TEST
    DEFAULT_WAREHOUSE = WH_ADE_TEST;

CREATE USER S_ADE_PROD
    PASSWORD = '***'
    COMMENT = 'Service account for Agile Data Engine PROD.'
    DEFAULT_ROLE = ROLE_ADE_PROD
    DEFAULT_WAREHOUSE = WH_ADE_PROD;

Generate strong, environment-specific passwords and store them securely.
Grant the roles to the users:
USE ROLE USERADMIN;

GRANT ROLE ROLE_ADE_DEV TO USER S_ADE_DEV;
GRANT ROLE ROLE_ADE_TEST TO USER S_ADE_TEST;
GRANT ROLE ROLE_ADE_PROD TO USER S_ADE_PROD;
Grant privileges on the created objects to the ADE roles:
USE ROLE SYSADMIN;

-- Warehouses
GRANT MODIFY, OPERATE, USAGE ON WAREHOUSE WH_ADE_DEV TO ROLE ROLE_ADE_DEV;
GRANT MODIFY, OPERATE, USAGE ON WAREHOUSE WH_ADE_TEST TO ROLE ROLE_ADE_TEST;
GRANT MODIFY, OPERATE, USAGE ON WAREHOUSE WH_ADE_PROD TO ROLE ROLE_ADE_PROD;

-- Databases
GRANT USAGE, CREATE SCHEMA ON DATABASE CLOUDDW_DEV TO ROLE ROLE_ADE_DEV;
GRANT USAGE, CREATE SCHEMA ON DATABASE CLOUDDW_TEST TO ROLE ROLE_ADE_TEST;
GRANT USAGE, CREATE SCHEMA ON DATABASE CLOUDDW_PROD TO ROLE ROLE_ADE_PROD;
If you are using a STORAGE INTEGRATION to load source data into Snowflake, grant privileges to the ADE roles to use the storage integration:
USE ROLE ACCOUNTADMIN;

GRANT USAGE ON INTEGRATION ADE_SOURCE_DATA TO ROLE ROLE_ADE_DEV;
GRANT USAGE ON INTEGRATION ADE_SOURCE_DATA TO ROLE ROLE_ADE_TEST;
GRANT USAGE ON INTEGRATION ADE_SOURCE_DATA TO ROLE ROLE_ADE_PROD;
Replace ADE_SOURCE_DATA with the name of your storage integration(s). For more information, please refer to Loading Source Data.
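For reference, a storage integration for an S3 source data bucket could be created roughly as follows; the role ARN and bucket name are placeholders, and creating an integration requires ACCOUNTADMIN or a role with the CREATE INTEGRATION privilege:

```sql
USE ROLE ACCOUNTADMIN;

CREATE STORAGE INTEGRATION ADE_SOURCE_DATA
    TYPE = EXTERNAL_STAGE
    STORAGE_PROVIDER = 'S3'
    ENABLED = TRUE
    STORAGE_AWS_ROLE_ARN = 'arn:aws:iam::<account-id>:role/<role-name>'
    STORAGE_ALLOWED_LOCATIONS = ('s3://<source-data-bucket>/');
```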
Here is a summary of the privileges granted to Agile Data Engine in Snowflake:
| Object | Required privileges | Optional privileges | Notes |
|---|---|---|---|
| DATABASE | USAGE, CREATE SCHEMA | - | By default, ADE creates target schemas into the database, thereby getting schema ownership. If schemas already exist, sufficient schema privileges need to be granted to the ADE role. |
| WAREHOUSE | OPERATE, USAGE | MODIFY | The MODIFY privilege enables warehouse scaling within workflows. |
| STORAGE INTEGRATION | USAGE | - | - |
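To verify the setup, you can inspect the grants with standard Snowflake commands, for example:

```sql
SHOW GRANTS TO ROLE ROLE_ADE_DEV;
SHOW GRANTS TO USER S_ADE_DEV;
SHOW GRANTS ON DATABASE CLOUDDW_DEV;
```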