CONFIG_LOAD_SCHEDULES

CONFIG_LOAD_SCHEDULES is a configuration package used to define load schedules in Agile Data Engine. Schedules defined in this package translate into executable workflows in the Runtime environments' workflow orchestration.

Loads are assigned to schedules, and workflows are automatically generated based on:

  • The assigned schedule

  • Dependency relationships defined in the loads

See Designing Workflows for more details about how workflows are generated.

Key features include:

  • Time-based scheduling using cron expressions

  • Support for workflow triggers and preconditions to define dependencies between workflows

  • Environment-specific schedule configurations with environment variables

  • Schedule time zone selection

  • Priority control to influence workflow execution order

Terminology

In this context, the terms schedule, workflow, and DAG (Directed Acyclic Graph) are often used interchangeably.

  • A schedule is defined in CONFIG_LOAD_SCHEDULES.

  • When deployed, each schedule becomes a workflow (also referred to as a DAG) in the workflow orchestration of the Runtime environment.

  • These workflows encapsulate the execution logic and dependencies of assigned loads, driven by schedule and load configurations.


Tutorials

See the video for a quick tutorial on how to create a load schedule:

https://www.youtube.com/watch?v=eTT3BQIkUls

Usage

  1. Open the CONFIG_LOAD_SCHEDULES configuration package.

  2. Navigate to the Load Schedules tab.

  3. Configure load schedules as needed.

You can also edit the contents of the package with Show Editor. Refer to the Contents section below for details on the available configuration structure.

After making changes, the CONFIG_LOAD_SCHEDULES package will contain uncommitted changes.
Be sure to commit and deploy the package to the Runtime environments for the configuration to take effect.

Note that entity packages that reference load schedules depend on CONFIG_LOAD_SCHEDULES. Therefore, changes to CONFIG_LOAD_SCHEDULES must be deployed before or together with the related entity packages.

Schedule name cleanup during package import
Schedule names are verified and cleaned up during the package import:

  • Names are converted to uppercase. Only the characters A to Z, digits and underscores are allowed: [A-Z0-9_].

  • Each run of consecutive non-allowed characters is replaced with a single underscore.

Example 1: The following names all result in the same schedule name, ADE_META_SCHEDULING_CRON:

  • ADE_META_SCHEDULING_CRON

  • ade meta scheduling cron

  • Ade#meta&scheduling#Cron

Example 2: Non-allowed characters in these names are replaced with underscores:

  • myöhäistetty_lataus → becomes schedule MY_H_ISTETTY_LATAUS

  • loading data */5 * * * 1,2,3,4,5 → becomes schedule LOADING_DATA_5_1_2_3_4_5
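
The cleanup rules above can be approximated with a short Python sketch. The function name clean_schedule_name is hypothetical, and the actual import logic in Agile Data Engine is not shown in this document; the sketch assumes uppercasing happens before substitution and only aims to reproduce the documented examples.

```python
import re


def clean_schedule_name(raw_name: str) -> str:
    """Approximate the documented cleanup of schedule names.

    Hypothetical helper for illustration, not part of Agile Data Engine.
    Step 1: convert to uppercase.
    Step 2: replace each run of characters outside [A-Z0-9_] with one underscore.
    """
    upper = raw_name.upper()
    return re.sub(r"[^A-Z0-9_]+", "_", upper)


# The names used in Example 1 and Example 2 above.
for name in (
    "ade meta scheduling cron",
    "Ade#meta&scheduling#Cron",
    "myöhäistetty_lataus",
    "loading data */5 * * * 1,2,3,4,5",
):
    print(name, "->", clean_schedule_name(name))
```

Running a name through such a helper before creating a schedule can show in advance whether two differently written names would collide after import.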

Priority weight

The priority weight setting allows you to influence the execution order of workflows within workflow orchestration.

This setting maps to Apache Airflow’s priority_weight parameter, which determines how task instances are prioritized when worker slots are limited. Workflows with higher priority weights will be scheduled before those with lower weights, assuming all other conditions (e.g. dependencies, scheduling time) are equal.

Agile Data Engine uses the absolute weighting method by default, meaning the assigned value is used as-is: larger numbers mean higher priority. For example:

  • A workflow with a priority weight of 10 will be scheduled before one with a weight of 5, when competing for execution resources.

If no priority is defined, the default is 1, which gives the workflow equal priority with others.

For more details, refer to the Airflow documentation on priority weight.
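
As a rough mental model only, the effect of absolute priority weights can be sketched as a sort in Python. The function name order_by_priority and the schedule names are hypothetical. The real scheduler also takes dependencies, pools and scheduling time into account, which this sketch ignores.

```python
def order_by_priority(workflows):
    """Return workflows with higher priorityWeight first.

    A missing priorityWeight falls back to the documented default of 1.
    Python's sort is stable, so workflows with equal weights keep their
    original relative order.
    """
    return sorted(
        workflows,
        key=lambda wf: wf.get("priorityWeight", 1),
        reverse=True,
    )


# Hypothetical workflows competing for limited worker slots.
queue = [
    {"schedulingName": "NIGHTLY", "priorityWeight": 5},
    {"schedulingName": "ADHOC"},  # no weight set, treated as 1
    {"schedulingName": "FINANCE", "priorityWeight": 10},
]

print([wf["schedulingName"] for wf in order_by_priority(queue)])
# ['FINANCE', 'NIGHTLY', 'ADHOC']
```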

Triggered schedules

Setting up triggered schedules allows you to define downstream schedules that will automatically start after the source workflow has completed.

You can add triggered schedules from the Summary view of a load schedule by selecting Add triggered schedule.

Setting multiple schedules for a workflow

With triggered schedules, it is possible to trigger a schedule (workflow) from multiple other schedules that have different cron expressions.
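
The following Python sketch illustrates the idea with the schedulings and schedulingTriggers structures described in the Contents section. The schedule names, cron expressions and identifiers are hypothetical placeholders (real identifiers are generated UUIDs), and triggering_crons is an illustrative helper, not a product function.

```python
# Two schedules with their own cron expressions both trigger REPORT,
# which has no cron expression of its own.
config = {
    "schedulings": [
        {"schedulingId": "id-morning", "schedulingName": "MORNING", "cronExpr": "0 6 * * *"},
        {"schedulingId": "id-evening", "schedulingName": "EVENING", "cronExpr": "0 18 * * *"},
        {"schedulingId": "id-report", "schedulingName": "REPORT"},
    ],
    "schedulingTriggers": [
        {"schedulingId": "id-morning", "triggeredSchedulingId": "id-report"},
        {"schedulingId": "id-evening", "triggeredSchedulingId": "id-report"},
    ],
}


def triggering_crons(cfg, triggered_name):
    """List the cron expressions of all schedules that trigger the named schedule."""
    by_id = {s["schedulingId"]: s for s in cfg["schedulings"]}
    target_ids = {
        s["schedulingId"]
        for s in cfg["schedulings"]
        if s["schedulingName"] == triggered_name
    }
    return [
        by_id[t["schedulingId"]].get("cronExpr")
        for t in cfg["schedulingTriggers"]
        if t["triggeredSchedulingId"] in target_ids
    ]


print(triggering_crons(config, "REPORT"))  # ['0 6 * * *', '0 18 * * *']
```

In this arrangement REPORT effectively follows two cron expressions without defining either of them itself.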

Schedule variables

Schedule variables allow you to define and assign values to variables that are specific to the execution of a schedule.

You can:

  • Define new variables directly within the schedule.

  • Reference variables defined elsewhere (e.g. environment variables).

  • Assign values that will be resolved at execution time in the Runtime environment.

Schedule variables can be added in the load schedule Summary view with Add schedule variable. Define a new VARIABLE NAME or reference an existing variable, then set its VARIABLE VALUE.

You can combine existing environment variables and schedule variables. Variables can be referenced with the following syntax:

JSON
<variable_defined_in_config_environment_variables>
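
To make the reference syntax concrete, here is a minimal Python sketch of how a value containing angle-bracket references could be resolved. This is an assumption for illustration: the actual resolution logic, the set of characters allowed in variable names, and the handling of unknown names are not specified in this document, and resolve_references is a hypothetical helper.

```python
import re

# Assumed name pattern: letters, digits and underscores inside angle brackets.
REFERENCE = re.compile(r"<([A-Za-z0-9_]+)>")


def resolve_references(value, variables):
    """Replace each <name> in value with variables[name].

    Unknown names are left unchanged (an assumption made for this sketch).
    """
    def substitute(match):
        name = match.group(1)
        if name in variables:
            return str(variables[name])
        return match.group(0)

    return REFERENCE.sub(substitute, value)


# Variables as they might look for one Runtime environment.
dev_variables = {"fina_warehouse": "FINA_DEV_WH"}

print(resolve_references("<fina_warehouse>", dev_variables))  # FINA_DEV_WH
print(resolve_references("<not_defined>", dev_variables))     # <not_defined>
```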

Schedule preconditions

A workflow can be configured to check the execution state of other workflows as preconditions before running.

  • If the defined preconditions are not met, the workflow will skip all tasks for that run.

  • If the preconditions are met, the workflow will proceed to execute its tasks as configured.

Schedule preconditions can be used together with triggered schedules to implement multi-dependent workflow triggering. For example:

  • Workflow C is configured as a triggered schedule by both workflows A and B.

  • Additionally, workflow C has preconditions set to require that both A and B have completed successfully.

This ensures that workflow C is only executed once both A and B have completed successfully.

Schedule preconditions can be added in the load schedule Summary view with Add schedule precondition. Each schedule can reference one or more upstream schedules whose state is evaluated at runtime.
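
The A, B and C example above could be expressed in configuration roughly as follows. This Python sketch builds the schedulingTriggers and schedulingPreconditions entries using the keys described in the Contents section. The helper multi_dependency_config, the placeholder identifiers and the use of uuid4 for the precondition identifier are assumptions; in practice the identifiers are generated by Agile Data Engine.

```python
import json
import uuid


def multi_dependency_config(upstream_ids, downstream_id, window_minutes=120):
    """Build trigger and precondition entries for one downstream schedule.

    Each upstream schedule triggers the downstream schedule, and the
    downstream schedule requires every upstream to have succeeded
    within the time window.
    """
    triggers = [
        {"schedulingId": upstream, "triggeredSchedulingId": downstream_id}
        for upstream in upstream_ids
    ]
    preconditions = [
        {
            # Placeholder id; normally generated when the precondition is created.
            "schedulingPreconditionId": str(uuid.uuid4()),
            "schedulingId": downstream_id,
            "upstreamSchedulingId": upstream,
            "type": "UPSTREAM_SUCCESS",
            "enabled": True,
            "timeWindowMinutes": window_minutes,
        }
        for upstream in upstream_ids
    ]
    return {"schedulingTriggers": triggers, "schedulingPreconditions": preconditions}


# Hypothetical ids standing in for workflows A, B and C.
print(json.dumps(multi_dependency_config(["id-a", "id-b"], "id-c"), indent=2))
```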

Precondition types

| Type | Description |
|---|---|
| UPSTREAM_SUCCESS | The upstream schedule must have completed successfully within the given time window. |
| UPSTREAM_FAILURE | The upstream schedule must have failed within the given time window. |
| UPSTREAM_RUN | The upstream schedule must have run (successfully or not) within the time window. |
| UPSTREAM_NOT_RUN | The upstream schedule must not have run at all within the time window. |

Time window

  • Specifies the number of minutes in which the upstream schedule’s execution must fall.

  • The end time of the upstream workflow is used for comparison.

  • It is recommended to keep this window relatively short, typically not more than 24 hours (1440 minutes).
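
One way to read the precondition types and the time window together is sketched below in Python. This is an interpretation, not the product's implementation: it assumes that any upstream run whose end time falls inside the window counts (rather than only the latest run), that the window boundaries are inclusive, and that run states are the strings "success" and "failed". The function precondition_met is hypothetical.

```python
from datetime import datetime, timedelta


def precondition_met(kind, upstream_runs, now, window_minutes):
    """Evaluate one precondition against upstream runs.

    upstream_runs is a list of (end_time, state) tuples, where state is
    "success" or "failed". Only runs that ended within the last
    window_minutes before now are considered.
    """
    window_start = now - timedelta(minutes=window_minutes)
    recent_states = [
        state for end_time, state in upstream_runs
        if window_start <= end_time <= now
    ]
    if kind == "UPSTREAM_SUCCESS":
        return "success" in recent_states
    if kind == "UPSTREAM_FAILURE":
        return "failed" in recent_states
    if kind == "UPSTREAM_RUN":
        return len(recent_states) > 0
    if kind == "UPSTREAM_NOT_RUN":
        return len(recent_states) == 0
    raise ValueError(f"Unknown precondition type: {kind}")


now = datetime(2024, 1, 1, 12, 0)
runs = [(datetime(2024, 1, 1, 11, 0), "success")]  # ended one hour ago

print(precondition_met("UPSTREAM_SUCCESS", runs, now, 120))  # True
print(precondition_met("UPSTREAM_SUCCESS", runs, now, 30))   # False
```

The second call shows why the window length matters: the same successful run no longer satisfies the precondition once it falls outside the window.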


Contents

Schedule configuration

Schedule configurations are managed inside a JSON array block named schedulings.

| Key | Value type | Example | Description |
|---|---|---|---|
| schedulingId | String | 05c70370-d550-42ce-a305-6d693038e709 | Unique identifier of the schedule. Automatically generated when a schedule is created from the Load Schedules tab. |
| schedulingName | String | TAXIDATA | Name of the schedule. Supports uppercase characters from A to Z, numbers and underscores [A-Z0-9_]. An existing schedule name can be altered as the schedules are identified by the schedulingId. |
| cronExpr | String | 30 2 * * * | Optional: Cron expression for the schedule. Use an environment variable if you want a different schedule in different environments. Leave blank if the workflow should only be triggered manually or by other workflows. |
| loadPool | String | dag_custom_pool | Optional: Load pool the workflow will be assigned to in Workflow Orchestration. Leave blank to use loading_default_pool. |
| dagGenerationMode | String | OPTIMIZED_LOAD_ORIENTED | Optional: Sets the DAG generation mode for the schedule. Leave blank to use default OPTIMIZED_ENTITY_ORIENTED. This setting overrides environment-level settings for the schedule. |
| description | String | #TAXIDATA Executed once per day | Optional: Schedule description, supports #tags. Workflows can be filtered by tag in Workflow Orchestration. |
| schedulingTimeZone | String | Europe/Helsinki | Optional: Sets the time zone for the schedule. Leave blank to use the default UTC time zone. |
| priorityWeight | Integer | 2 | Optional: Sets a priority weight for the schedule. Leave blank to use the default value of 1, which gives the workflow equal priority with others. |

Example: Schedule configuration

JSON
"schedulings": [
  ...
  {
    "schedulingId": "05c70370-d550-42ce-a305-6d693038e709",
    "schedulingName": "TAXIDATA",
    "cronExpr": "30 2 * * *"
  }
  ...
]
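
For reference, a schedule entry that sets every key from the table might look like the output of the Python sketch below. It only combines the example values listed in the table; whether this particular combination suits a given environment is an assumption, and the line break in the description is taken from the table example.

```python
import json

# One schedule entry using the example values from the table above.
schedule = {
    "schedulingId": "05c70370-d550-42ce-a305-6d693038e709",
    "schedulingName": "TAXIDATA",
    "cronExpr": "30 2 * * *",                        # daily at 02:30
    "loadPool": "dag_custom_pool",
    "dagGenerationMode": "OPTIMIZED_LOAD_ORIENTED",
    "description": "#TAXIDATA\nExecuted once per day",
    "schedulingTimeZone": "Europe/Helsinki",         # cron is read in this zone
    "priorityWeight": 2,
}

# Render the surrounding schedulings array as JSON text.
rendered = json.dumps({"schedulings": [schedule]}, indent=2)
print(rendered)
```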

Triggered schedule configuration

Triggered schedule configurations are managed inside a JSON array block named schedulingTriggers.

| Key | Value type | Example | Description |
|---|---|---|---|
| schedulingId | String | 290aaefd-1a6f-4020-a6de-ebcbcf645d8f | References the schedule defined within schedulings that triggers another schedule. |
| triggeredSchedulingId | String | e1893414-edd3-4888-9250-3e02c7a9f300 | References the schedule defined within schedulings that will be triggered. |

Example: Triggered schedule configuration

JSON
"schedulingTriggers": [
  ...
  {
    "schedulingId": "290aaefd-1a6f-4020-a6de-ebcbcf645d8f",
    "triggeredSchedulingId": "e1893414-edd3-4888-9250-3e02c7a9f300"
  }
  ...
]

Schedule variable configuration

Schedule variable configurations are managed inside a JSON array block named schedulingVariables.

| Key | Value type | Example | Description |
|---|---|---|---|
| schedulingId | String | 4a7a3646-1637-43ae-986d-24def4c94d78 | References a schedule defined within schedulings. |
| variableName | String | warehouse_name | New or referenced variable name. |
| variableValue | String | <fina_warehouse> | Variable value set for the schedule. Can also be a variable reference, see example below. |

Example: Schedule variable configuration

JSON
"schedulingVariables": [
  ...
  {
    "schedulingId": "4a7a3646-1637-43ae-986d-24def4c94d78",
    "variableName": "warehouse_name",
    "variableValue": "<fina_warehouse>"
  }
  ...
]

Schedule precondition configuration

Schedule precondition configurations are managed inside a JSON array block named schedulingPreconditions.

| Key | Value type | Example | Description |
|---|---|---|---|
| schedulingPreconditionId | String | 5f3e60b5-bc83-4973-b96b-be56288b0820 | Unique identifier for the schedule precondition. Generated when the precondition is created. |
| schedulingId | String | 4a7a3646-1637-43ae-986d-24def4c94d78 | References the schedule in schedulings for which the precondition is configured. |
| upstreamSchedulingId | String | d9ea2f9c-a441-42ad-af88-835c71b6c547 | References the upstream schedule defined in schedulings that is being checked as a precondition. |
| type | String | UPSTREAM_SUCCESS | Precondition type, available values: UPSTREAM_SUCCESS, UPSTREAM_FAILURE, UPSTREAM_RUN, UPSTREAM_NOT_RUN. See details above. |
| enabled | Boolean | true | Controls whether the precondition is enabled or disabled. |
| timeWindowMinutes | Integer | 120 | Specifies the number of minutes in which the upstream schedule’s execution must fall (end time). |
| description | String | Check that workflow has finished. | Optional: Description for the precondition. |

Example: Schedule precondition configuration

JSON
"schedulingPreconditions": [
  ...
  {
      "schedulingPreconditionId": "5f3e60b5-bc83-4973-b96b-be56288b0820",
      "schedulingId": "4a7a3646-1637-43ae-986d-24def4c94d78",
      "upstreamSchedulingId": "d9ea2f9c-a441-42ad-af88-835c71b6c547",
      "type": "UPSTREAM_SUCCESS",
      "enabled": true,
      "timeWindowMinutes": 120,
      "description": "Check that workflow has finished."
  }
  ...
]
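
Because triggers, variables and preconditions all reference schedules by identifier, it can be useful to check a hand-edited package for references that point at no entry in schedulings. The Python sketch below is one possible check under that reading; dangling_references is a hypothetical helper, and Agile Data Engine may perform its own validation that is not described here.

```python
def dangling_references(cfg):
    """Return (block, key, value) for each id that is missing from schedulings."""
    known_ids = {s["schedulingId"] for s in cfg.get("schedulings", [])}
    # Every key that is documented as referencing a schedule in schedulings.
    reference_keys = [
        ("schedulingTriggers", "schedulingId"),
        ("schedulingTriggers", "triggeredSchedulingId"),
        ("schedulingVariables", "schedulingId"),
        ("schedulingPreconditions", "schedulingId"),
        ("schedulingPreconditions", "upstreamSchedulingId"),
    ]
    problems = []
    for block, key in reference_keys:
        for entry in cfg.get(block, []):
            if entry.get(key) not in known_ids:
                problems.append((block, key, entry.get(key)))
    return problems


# Hypothetical package content with one broken trigger reference.
package = {
    "schedulings": [{"schedulingId": "id-a", "schedulingName": "A"}],
    "schedulingTriggers": [{"schedulingId": "id-a", "triggeredSchedulingId": "id-b"}],
}

print(dangling_references(package))
# [('schedulingTriggers', 'triggeredSchedulingId', 'id-b')]
```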

Examples

Environment-specific Cron Expression

By combining CONFIG_ENVIRONMENT_VARIABLES and CONFIG_LOAD_SCHEDULES, you can define environment-specific cron expressions for a schedule. This allows you to run workflows on different schedules across Runtime environments (e.g. DEV, TEST, PROD).

In this example, the schedule ENVIRONMENT_BASED:

  • Does not run in the DEV environment (no cron defined)

  • Runs every hour on the 5th minute in the PROD environment

  1. Define an environment variable in CONFIG_ENVIRONMENT_VARIABLES:

    CODE
    ...
    "environments": [
      {
        "environmentName": "DEV"
      },
      {
        "environmentName": "PROD"
      }
    ],
    "environmentVariables": [
      {
        "environmentName": "DEV",
        "variableName": "ENVIRONMENT_BASED_CRON",
        "variableValue": null
      },
      {
        "environmentName": "PROD",
        "variableName": "ENVIRONMENT_BASED_CRON",
        "variableValue": "5 * * * *"
      }
    ]
    ...
  2. Define a schedule in CONFIG_LOAD_SCHEDULES using the environment variable as the cron expression:

    CODE
    ...
    {
      "schedulingId": "e4009f81-de0f-4d0f-8800-45822e44ffdf",
      "schedulingName": "ENVIRONMENT_BASED",
      "cronExpr": "<ENVIRONMENT_BASED_CRON>"
    }
    ...
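
The effect of the two steps can be sketched in Python as a lookup of the effective cron expression per environment. This is a simplified model: cron_for_environment is a hypothetical helper, it only handles a cronExpr that consists of a single reference, and treating an undefined variable the same as a null value is an assumption.

```python
# Environment variables as defined in step 1.
ENVIRONMENT_VARIABLES = [
    {"environmentName": "DEV", "variableName": "ENVIRONMENT_BASED_CRON", "variableValue": None},
    {"environmentName": "PROD", "variableName": "ENVIRONMENT_BASED_CRON", "variableValue": "5 * * * *"},
]


def cron_for_environment(environment, cron_expr, environment_variables):
    """Return the effective cron expression, or None when no cron is defined."""
    if cron_expr and cron_expr.startswith("<") and cron_expr.endswith(">"):
        name = cron_expr[1:-1]
        for variable in environment_variables:
            if (variable["environmentName"] == environment
                    and variable["variableName"] == name):
                return variable["variableValue"]
        return None  # assumption: an undefined variable means no cron
    return cron_expr or None


for env in ("DEV", "PROD"):
    print(env, cron_for_environment(env, "<ENVIRONMENT_BASED_CRON>", ENVIRONMENT_VARIABLES))
# DEV None
# PROD 5 * * * *
```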

Using schedule-specific warehouses in Snowflake

This example combines predefined variables, environment variables defined in CONFIG_ENVIRONMENT_VARIABLES and schedule variables (see above) to define a schedule-specific and environment-specific warehouse in Snowflake.

  1. Define an environment variable for the schedule-specific warehouse and set its values per environment in CONFIG_ENVIRONMENT_VARIABLES, for example:

    JSON
    ...
    "environments": [
      {
       "environmentName": "DEV"
      },
      {
       "environmentName": "QA"
      },
      {
       "environmentName": "PROD"
      }
    ],
    "environmentVariables": [
      {
       "environmentName": "DEV",
       "variableName": "fina_warehouse",
       "variableValue": "FINA_DEV_WH"
      },
      {
       "environmentName": "QA",
       "variableName": "fina_warehouse",
       "variableValue": "FINA_QA_WH"
      },
      {
       "environmentName": "PROD",
       "variableName": "fina_warehouse",
       "variableValue": "FINA_PROD_WH"
      }
    ]
    ...

    Where fina_warehouse is the environment variable and the variableValue values are the environment-specific warehouse names.

  2. In the load schedule Summary view, define a schedule variable with Add schedule variable:

    • Variable name: warehouse_name

    • Variable value: <fina_warehouse>

The selection refers to the predefined variable warehouse_name and sets its value to <fina_warehouse>, which is defined per environment in CONFIG_ENVIRONMENT_VARIABLES.
