CONFIG_LOAD_SCHEDULES
CONFIG_LOAD_SCHEDULES is a configuration package used to define load schedules in Agile Data Engine. Schedules defined in this package translate into executable workflows in the Runtime environments' workflow orchestration.
Loads are assigned to schedules, and workflows are automatically generated based on:
The assigned schedule
Dependency relationships defined in the loads
See Designing Workflows for more details about how workflows are generated.
Key features include:
Time-based scheduling using cron expressions
Support for workflow triggers and preconditions to define dependencies between workflows
Environment-specific schedule configurations with environment variables
Schedule time zone selection
Priority control to influence workflow execution order
Terminology
In this context, the terms schedule, workflow, and DAG (Directed Acyclic Graph) are often used interchangeably.
A schedule is defined in CONFIG_LOAD_SCHEDULES.
When deployed, each schedule becomes a workflow (also referred to as a DAG) in the workflow orchestration of the Runtime environment.
These workflows encapsulate the execution logic and dependencies of assigned loads, driven by schedule and load configurations.
Tutorials
See the video for a quick tutorial on how to create a load schedule:
https://www.youtube.com/watch?v=eTT3BQIkUls
Usage
Open the CONFIG_LOAD_SCHEDULES configuration package.
Navigate to the Load Schedules tab.
Configure load schedules as needed.
You can also edit the contents of the package with Show Editor. Refer to the Contents section below for details on the available configuration structure.
After making changes, the CONFIG_LOAD_SCHEDULES package will contain uncommitted changes.
Be sure to commit and deploy the package to the Runtime environments for the configuration to take effect.
Note that entity packages that reference load schedules depend on CONFIG_LOAD_SCHEDULES. Therefore, changes to CONFIG_LOAD_SCHEDULES must be deployed before or together with the related entity packages.
Schedule name cleanup during package import
Schedule names are validated and cleaned up during package import:
Names are transformed to uppercase; only the characters A to Z, digits, and underscores are allowed [A-Z0-9_].
Each run of consecutive disallowed characters is replaced with a single underscore.
Example 1: These examples would result in the same schedule name ADE_META_SCHEDULING_CRON:
ADE_META_SCHEDULING_CRON
ade meta scheduling cron
Ade#meta&scheduling#Cron
Example 2: In these examples, disallowed characters are transformed to underscores:
myöhäistetty_lataus
→ will become MY_H_ISTETTY_LATAUS
loading data */5 * * * 1,2,3,4,5
→ will become LOADING_DATA_5_1_2_3_4_5
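Assuming the rules above (uppercase, allow only [A-Z0-9_], collapse each run of other characters into one underscore), the cleanup can be sketched in Python. This is an illustration of the documented behavior, not Agile Data Engine's actual implementation:

```python
import re

def clean_schedule_name(name: str) -> str:
    """Sketch of the documented import-time cleanup: uppercase the name,
    then replace each run of disallowed characters with one underscore."""
    return re.sub(r"[^A-Z0-9_]+", "_", name.upper())

print(clean_schedule_name("Ade#meta&scheduling#Cron"))  # ADE_META_SCHEDULING_CRON
print(clean_schedule_name("myöhäistetty_lataus"))       # MY_H_ISTETTY_LATAUS
```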
Priority weight
The priority weight setting allows you to influence the execution order of workflows within workflow orchestration.
This setting maps to Apache Airflow’s priority_weight parameter, which determines how task instances are prioritized when worker slots are limited. Workflows with higher priority weights will be scheduled before those with lower weights, assuming all other conditions (e.g. dependencies, scheduling time) are equal.
Agile Data Engine uses the absolute weighting method by default, meaning the assigned value is used as-is; larger numbers equal higher priority. For example:
A workflow with a priority weight of 10 will be scheduled before one with a weight of 5 when competing for execution resources.
If no priority is defined, the default is 1, which gives the workflow equal priority with others.
For more details, refer to the Airflow documentation on priority weight.
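As an illustration, reusing the TAXIDATA example from the Contents section below, a schedulings entry with a raised priority weight might look like this (a sketch; only the priorityWeight key is added relative to that example):

```json
"schedulings": [
  {
    "schedulingId": "05c70370-d550-42ce-a305-6d693038e709",
    "schedulingName": "TAXIDATA",
    "cronExpr": "30 2 * * *",
    "priorityWeight": 10
  }
]
```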
Triggered schedules
Setting up triggered schedules allows you to define downstream schedules that will automatically start after the source workflow has completed.
You can add triggered schedules from the Summary view of a load schedule by selecting Add triggered schedule.
Setting multiple schedules for a workflow
With triggered schedules, it is possible to trigger a schedule (workflow) from multiple other schedules that have different cron expressions.
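As a sketch, triggering one downstream workflow from two differently scheduled source workflows amounts to two schedulingTriggers entries sharing the same triggeredSchedulingId (the IDs below are placeholders):

```json
"schedulingTriggers": [
  {
    "schedulingId": "SOURCE-SCHEDULE-A-ID",
    "triggeredSchedulingId": "DOWNSTREAM-SCHEDULE-ID"
  },
  {
    "schedulingId": "SOURCE-SCHEDULE-B-ID",
    "triggeredSchedulingId": "DOWNSTREAM-SCHEDULE-ID"
  }
]
```

With this configuration, the downstream workflow starts whenever either source workflow completes.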
Schedule variables
Schedule variables allow you to define and assign values to variables that are specific to the execution of a schedule.
You can:
Define new variables directly within the schedule.
Reference variables defined elsewhere (e.g. environment variables).
Assign values that will be resolved at execution time in the Runtime environment.
Schedule variables can be added in the load schedule Summary view with Add schedule variable. Define a new VARIABLE NAME or reference an existing variable, set VARIABLE VALUE.
You can combine existing environment variables and schedule variables. Variables can be referenced with the following syntax:
<variable_defined_in_config_environment_variables>
Schedule preconditions
A workflow can be configured to check the execution state of other workflows as preconditions before running.
If the defined preconditions are not met, the workflow will skip all of its tasks for that run.
If the preconditions are met, the workflow will proceed to execute its tasks as configured.
Schedule preconditions can be used together with triggered schedules to implement multi-dependent workflow triggering. For example:
Workflow C is configured as a triggered schedule by both workflows A and B.
Additionally, workflow C has preconditions set to require that both A and B have completed successfully.
This ensures that workflow C is only executed once both A and B have completed successfully.
Schedule preconditions can be added in the load schedule Summary view with Add schedule precondition. Each schedule can reference one or more upstream schedules whose state is evaluated at runtime.
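The A/B/C scenario above could be sketched as follows; the IDs are placeholders for the schedulingIds of workflows A, B, and C, and the 120-minute time window is an arbitrary choice for illustration:

```json
"schedulingTriggers": [
  { "schedulingId": "SCHEDULE-A-ID", "triggeredSchedulingId": "SCHEDULE-C-ID" },
  { "schedulingId": "SCHEDULE-B-ID", "triggeredSchedulingId": "SCHEDULE-C-ID" }
],
"schedulingPreconditions": [
  {
    "schedulingPreconditionId": "GENERATED-ID-1",
    "schedulingId": "SCHEDULE-C-ID",
    "upstreamSchedulingId": "SCHEDULE-A-ID",
    "type": "UPSTREAM_SUCCESS",
    "enabled": true,
    "timeWindowMinutes": 120
  },
  {
    "schedulingPreconditionId": "GENERATED-ID-2",
    "schedulingId": "SCHEDULE-C-ID",
    "upstreamSchedulingId": "SCHEDULE-B-ID",
    "type": "UPSTREAM_SUCCESS",
    "enabled": true,
    "timeWindowMinutes": 120
  }
]
```

Either A or B completing triggers a run of C, but C only executes its tasks once both A and B have succeeded within the time window.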
Precondition types
| Type | Description |
|---|---|
| UPSTREAM_SUCCESS | The upstream schedule must have completed successfully within the given time window. |
| UPSTREAM_FAILURE | The upstream schedule must have failed within the given time window. |
| UPSTREAM_RUN | The upstream schedule must have run (successfully or not) within the time window. |
| UPSTREAM_NOT_RUN | The upstream schedule must not have run at all within the time window. |
Time window
Specifies the number of minutes in which the upstream schedule’s execution must fall.
The end time of the upstream workflow is used for comparison.
It is recommended to keep this window relatively short, typically not more than 24 hours (1440 minutes).
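To illustrate how such a window is evaluated (an illustration only, not the orchestrator's actual code): an UPSTREAM_SUCCESS precondition with a timeWindowMinutes of 120 passes when the upstream run's end time falls within the last 120 minutes at evaluation time:

```python
from datetime import datetime, timedelta, timezone

def precondition_met(upstream_end: datetime, time_window_minutes: int,
                     now: datetime) -> bool:
    """True if the upstream run's end time falls within the window."""
    return now - timedelta(minutes=time_window_minutes) <= upstream_end <= now

now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
ended = now - timedelta(minutes=90)  # upstream finished 90 minutes ago
print(precondition_met(ended, 120, now))  # True: within a 120-minute window
print(precondition_met(ended, 60, now))   # False: outside a 60-minute window
```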
Contents
Schedule configuration
Schedule configurations are managed inside a JSON array block named schedulings.
| Key | Value type | Example | Description |
|---|---|---|---|
| schedulingId | String | 05c70370-d550-42ce-a305-6d693038e709 | Unique identifier of the schedule. Automatically generated when a schedule is created from the Load Schedules tab. |
| schedulingName | String | TAXIDATA | Name of the schedule. Supports uppercase characters from A to Z, numbers and underscores [A-Z0-9_]. An existing schedule name can be altered as schedules are identified by the schedulingId. |
| cronExpr | String | 30 2 * * * | Optional: Cron expression for the schedule. Use an environment variable if you want a different schedule in different environments. Leave blank if the workflow should only be triggered manually or by other workflows. |
| loadPool | String | dag_custom_pool | Optional: Load pool the workflow will be assigned to in Workflow Orchestration. Leave blank to use the default pool. |
| dagGenerationMode | String | OPTIMIZED_LOAD_ORIENTED | Optional: Sets the DAG generation mode for the schedule. Leave blank to use the default OPTIMIZED_ENTITY_ORIENTED. This setting overrides environment-level settings for the schedule. |
| description | String | #TAXIDATA | Optional: Schedule description, supports #tags. Workflows can be filtered by tag in Workflow Orchestration. |
| schedulingTimeZone | String | Europe/Helsinki | Optional: Sets the time zone for the schedule. Leave blank to use the default UTC time zone. |
| priorityWeight | Integer | 2 | Optional: Sets a priority weight for the schedule. Leave blank to use the default value of 1, which gives the workflow equal priority with others. |
Example: Schedule configuration
"schedulings": [
...
{
"schedulingId": "05c70370-d550-42ce-a305-6d693038e709",
"schedulingName": "TAXIDATA",
"cronExpr": "30 2 * * *"
}
...
]
Triggered schedule configuration
Triggered schedule configurations are managed inside a JSON array block named schedulingTriggers.
| Key | Value type | Example | Description |
|---|---|---|---|
| schedulingId | String | 290aaefd-1a6f-4020-a6de-ebcbcf645d8f | References the triggering schedule defined within schedulings. |
| triggeredSchedulingId | String | e1893414-edd3-4888-9250-3e02c7a9f300 | References the schedule defined within schedulings that is triggered when the triggering schedule completes. |
Example: Triggered schedule configuration
"schedulingTriggers": [
...
{
"schedulingId": "290aaefd-1a6f-4020-a6de-ebcbcf645d8f",
"triggeredSchedulingId": "e1893414-edd3-4888-9250-3e02c7a9f300"
}
...
]
Schedule variable configuration
Schedule variable configurations are managed inside a JSON array block named schedulingVariables.
| Key | Value type | Example | Description |
|---|---|---|---|
| schedulingId | String | 4a7a3646-1637-43ae-986d-24def4c94d78 | References a schedule defined within schedulings. |
| variableName | String | warehouse_name | New or referenced variable name. |
| variableValue | String | <fina_warehouse> | Variable value set for the schedule. Can also be a variable reference, see example below. |
Example: Schedule variable configuration
"schedulingVariables": [
...
{
"schedulingId": "4a7a3646-1637-43ae-986d-24def4c94d78",
"variableName": "warehouse_name",
"variableValue": "<fina_warehouse>"
}
...
]
Schedule precondition configuration
Schedule precondition configurations are managed inside a JSON array block named schedulingPreconditions.
| Key | Value type | Example | Description |
|---|---|---|---|
| schedulingPreconditionId | String | 5f3e60b5-bc83-4973-b96b-be56288b0820 | Unique identifier for the schedule precondition. Generated when the precondition is created. |
| schedulingId | String | 4a7a3646-1637-43ae-986d-24def4c94d78 | References the schedule in schedulings that the precondition applies to. |
| upstreamSchedulingId | String | d9ea2f9c-a441-42ad-af88-835c71b6c547 | References the upstream schedule defined in schedulings whose state is evaluated. |
| type | String | UPSTREAM_SUCCESS | Precondition type; available values: UPSTREAM_SUCCESS, UPSTREAM_FAILURE, UPSTREAM_RUN, UPSTREAM_NOT_RUN. See details above. |
| enabled | Boolean | true | Controls whether the precondition is enabled or disabled. |
| timeWindowMinutes | Integer | 120 | Specifies the number of minutes in which the upstream schedule’s execution (end time) must fall. |
| description | String | Check that workflow has finished. | Optional: Description for the precondition. |
Example: Schedule precondition configuration
"schedulingPreconditions": [
...
{
"schedulingPreconditionId": "5f3e60b5-bc83-4973-b96b-be56288b0820",
"schedulingId": "4a7a3646-1637-43ae-986d-24def4c94d78",
"upstreamSchedulingId": "d9ea2f9c-a441-42ad-af88-835c71b6c547",
"type": "UPSTREAM_SUCCESS",
"enabled": true,
"timeWindowMinutes": 120,
"description": "Check that workflow has finished."
}
...
]
Examples
Environment-specific Cron Expression
With the combination of CONFIG_ENVIRONMENT_VARIABLES and CONFIG_LOAD_SCHEDULES, you can define environment-specific cron expressions for a schedule. This allows you to run workflows in different schedules across Runtime environments (e.g. DEV, TEST, PROD).
In this example, the schedule ENVIRONMENT_BASED:
Does not run in the DEV environment (no cron defined)
Runs every hour on the 5th minute in the PROD environment
Define an environment variable in CONFIG_ENVIRONMENT_VARIABLES:
...
"environments": [
  { "environmentName": "DEV" },
  { "environmentName": "PROD" }
],
"environmentVariables": [
  {
    "environmentName": "DEV",
    "variableName": "ENVIRONMENT_BASED_CRON",
    "variableValue": null
  },
  {
    "environmentName": "PROD",
    "variableName": "ENVIRONMENT_BASED_CRON",
    "variableValue": "5 * * * *"
  }
]
...
Define a schedule in CONFIG_LOAD_SCHEDULES using environment variable as the cron expression:
...
{
  "schedulingId": "e4009f81-de0f-4d0f-8800-45822e44ffdf",
  "schedulingName": "ENVIRONMENT_BASED",
  "cronExpr": "<ENVIRONMENT_BASED_CRON>"
}
...
Using schedule-specific warehouses in Snowflake
This example combines predefined variables, environment variables defined in CONFIG_ENVIRONMENT_VARIABLES and schedule variables (see above) to define a schedule and an environment-specific warehouse in Snowflake.
Define an environment variable for the schedule-specific warehouse and set its values per environment in CONFIG_ENVIRONMENT_VARIABLES, for example:
...
"environments": [
  { "environmentName": "DEV" },
  { "environmentName": "QA" },
  { "environmentName": "PROD" }
],
"environmentVariables": [
  {
    "environmentName": "DEV",
    "variableName": "fina_warehouse",
    "variableValue": "FINA_DEV_WH"
  },
  {
    "environmentName": "QA",
    "variableName": "fina_warehouse",
    "variableValue": "FINA_QA_WH"
  },
  {
    "environmentName": "PROD",
    "variableName": "fina_warehouse",
    "variableValue": "FINA_PROD_WH"
  }
]
...
Here fina_warehouse is the environment variable, and the variableValue values are the environment-specific warehouse names.
In the load schedule Summary view, define a schedule variable with Add schedule variable:
Variable name: warehouse_name
Variable value: <fina_warehouse>
This refers to the predefined variable warehouse_name and sets its value to <fina_warehouse>, which is resolved per environment from the values defined in CONFIG_ENVIRONMENT_VARIABLES.