
API_CALL

Target cloud: Google Cloud

API_CALL is a load language for defining load steps that call the APIs of external systems. Use it when a load process needs to trigger an external action, or to wait for and poll the state of an external system.

API_CALL load steps are defined in YAML, following a schema that exposes the features described on this page.

API_CALL supports the following functionality:

  • Make any HTTP call by defining

    • URL (only HTTPS allowed)

    • Method: GET, POST, PUT, DELETE

    • Headers

    • Query parameters (automatically URL-encoded)

    • Timeout definition

    • Content

  • Retry logic by defining

    • Retry count

    • Backoff factor, minimum and maximum delays, etc.

    • Rules based on

      • HTTP response codes

      • Content extraction

  • Response transformations by defining

    • Conditions that select which transformation applies, based on

      • HTTP response codes

      • Content extraction

    • Result (SUCCESS, FAILED)

    • Affected rows

    • Variables (output that can be used in later steps, etc.)

    • Transformation results and variables can be based on

      • HTTP response codes

      • Content extraction

      • Constants
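
To illustrate what "automatically URL-encoded" means for query parameters in the list above, here is a minimal Python sketch using the standard library. The parameter names and values are hypothetical, and the exact encoding API_CALL applies (for example `+` versus `%20` for spaces) is not stated on this page, so treat the output only as an example of the general idea.

```python
from urllib.parse import urlencode

# Hypothetical query parameters; names and values are illustrative only.
params = {"name": "test 1", "filter": "status=RUNNING&limit=10"}

# urlencode percent-encodes reserved characters and joins the pairs with "&".
query_string = urlencode(params)
print(query_string)
```

Because reserved characters such as `=` and `&` inside a value are escaped, the value cannot be mistaken for an additional parameter by the receiving system.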

Variables (environment variables, workflow variables, etc.) are supported in API_CALL load steps.



Examples

Minimal example that makes a simple HTTP call to a Google Cloud Function:

  • POST to given URL

  • Default timeout 300s

  • Authorization with bearer token

    • <bearer_token_from_bq_service_account> is a special variable that is supported only when the target database is BigQuery. It lets the Google Cloud Function call reuse the credentials that are used for the target database.

  • Simple JSON content

  • Maximum of 4 retries with the default retry logic

  • Default response transformation: 200 and 201 are SUCCESS, all other responses are FAILED

YAML
type: HTTP
request:
  url: https://europe-north1-test-ade-gcp-ci.cloudfunctions.net/ade-ci-test-function
  method: POST
  headers:
    Authorization: <bearer_token_from_bq_service_account>
  content: |
    {"name": "test1"}
retries:
  total: 4
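
The default response transformation described above can be sketched in Python as follows. This only mirrors the stated rule (200/201 are SUCCESS, everything else is FAILED). The function name `default_result` is hypothetical and is not part of API_CALL.

```python
def default_result(http_status_code: int) -> str:
    """Map an HTTP status code to a step result using the default rule."""
    return "SUCCESS" if http_status_code in (200, 201) else "FAILED"


# Other 2xx codes such as 204 are not in the default success list.
for code in (200, 201, 204, 404, 500):
    print(code, default_result(code))
```

If an endpoint signals success with another status code, a custom response transformation like the ones in the later examples would be needed.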

More extensive example with an API that uses polling:

Step 1: Initiating a process in the external system

  • POST to the endpoint with JSON content

  • Authorization with bearer token

  • Retry on HTTP status codes 401, 404, 429, 500, 502, 503 and 504, and on requests that time out after 10 seconds

  • Parse the process_name variable from a 200/201 response by using a response transformation definition

  • Any other response makes this step fail

YAML
type: HTTP
request:
  url: https://europe-north1-ade-gcp-ci.cloudfunctions.net/ade-ci-data-processor
  timeout_seconds: 10
  method: POST
  headers:
    Content-Type: application/json
    Authorization: <bearer_token_from_bq_service_account>
  content: |
    {"process_time":"30","future_result":"SUCCESS","info":"Test call 1"}
retries:
    total: 3
    rules:
      - conditions:
          - type: VALUE_MATCHER
            source: <http_status_code>
            values: [401, 404, 429, 500, 502, 503, 504]
response:
  transformations:
    - description: In case of success we extract the returned process_name value, which is needed for polling in the next step
      conditions:
        - type: VALUE_MATCHER
          source: <http_status_code>
          values:
            - 200
            - 201
      variables:
        process_name:
          type: REGEXP_VALUE
          source: <http_response_content>
          regexp: '"process_name"\s*:\s*"([^"]+)"'
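
Before putting a regular expression into a transformation, it can help to try it against a sample payload. The sketch below runs the same pattern as in the YAML above with Python's `re` module. The response bodies are hypothetical, and it assumes that the pattern is searched anywhere in the content and that the first capture group becomes the variable value. Neither assumption is confirmed on this page, and the regex engine used by API_CALL may differ from Python's.

```python
import re

# Same pattern as in the YAML above.
PROCESS_NAME_PATTERN = r'"process_name"\s*:\s*"([^"]+)"'

# Hypothetical response bodies; the real function's payload may look different.
sample_response = '{"process_name": "process-123", "status": "RUNNING"}'
compact_response = '{"status":"RUNNING","process_name":"process-456"}'


def extract_process_name(content: str):
    """Return the first capture group, or None when the pattern does not match."""
    match = re.search(PROCESS_NAME_PATTERN, content)
    return match.group(1) if match else None


process_name = extract_process_name(sample_response)
print(process_name)
print(extract_process_name(compact_response))
print(extract_process_name('{"error": "not found"}'))
```

The `\s*` parts make the pattern tolerant of both compact and pretty-printed JSON.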

Step 2: Polling until the process finishes in the external system

  • GET to the endpoint, with a URL containing the <process_name> variable extracted in the previous load step

  • Authorization with bearer token

  • Retry until we get a 200 response where the status is not RUNNING

    • backoff min and max are set to the same value to poll at a fixed interval (every 5 seconds)

    • total is set to 12 → 12 * 5s = 60s (we expect the external process to finish within 60 seconds)

  • Check that we get a 200 response that returns status SUCCESS

  • Any other response makes this step fail, which indicates that the external process ran and finished, but with a failed status

YAML
type: HTTP
request:
  # the URL uses the variable containing the process name from the previous step
  url: https://europe-north1-ade-gcp-ci.cloudfunctions.net/ade-ci-data-processor/<process_name>/status
  timeout_seconds: 10
  method: GET
  headers:
    # Set the token audience explicitly: by default it is taken from the URL, which does not work for sub-resources of this cloud function
    Authorization: <bearer_token_from_bq_service_account:audience=https://europe-north1-ade-gcp-ci.cloudfunctions.net/ade-ci-data-processor>
retries:
    # we poll the status every 5 seconds and assume the process is ready within 60 seconds -> 60/5=12
    total: 12
    backoff_min_seconds: 5
    backoff_max_seconds: 5
    # we keep polling as long as we get a non-200 response or the status endpoint returns the RUNNING state (or until we hit the retry total)
    rules:
      - conditions:
          - type: VALUE_MATCHER
            source: <http_status_code>
            negate: true
            values: [200]
      - conditions:
          - type: VALUE_MATCHER
            source: <http_status_code>
            values: [200]
          - type: REGEXP_VALUE_MATCHER
            source: <http_response_content>
            regexp: 'RUNNING'
            values:
              - RUNNING
response:
  transformations:
    - description: Step is successful if 200 is returned and the state is SUCCESS (the process has finished successfully within the expected time span)
      conditions:
        - type: VALUE_MATCHER
          source: <http_status_code>
          values:
            - 200
        - type: VALUE_MATCHER
          source: <http_response_content>
          values:
           - SUCCESS
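
The polling behaviour configured above can be simulated with a small Python sketch. The sketch makes several assumptions that this page does not confirm: conditions inside one rule must all match while any matching rule triggers a retry (this follows the comment in the YAML), there is one initial call followed by at most `total` retries, VALUE_MATCHER compares the whole response content, and the status endpoint returns the state as plain text. The function names `should_retry` and `poll` are hypothetical.

```python
import re


def should_retry(status_code: int, content: str) -> bool:
    """Return True if either retry rule from the YAML above matches."""
    # Rule 1: any non-200 status code (VALUE_MATCHER with negate: true).
    rule_non_200 = status_code != 200
    # Rule 2: status 200 AND the body matches RUNNING (both conditions hold).
    rule_still_running = (
        status_code == 200 and re.search(r"RUNNING", content) is not None
    )
    return rule_non_200 or rule_still_running


def poll(responses, total=12, interval_seconds=5):
    """Replay scripted (status_code, content) responses.

    Returns (result, seconds_waited). Assumes one initial call followed by
    at most `total` retries, with a fixed wait before each retry.
    """
    responses = iter(responses)
    status_code, content = next(responses)
    waited = 0
    retries_used = 0
    while should_retry(status_code, content) and retries_used < total:
        waited += interval_seconds  # backoff min == max gives a fixed interval
        retries_used += 1
        status_code, content = next(responses)
    # Response transformation: 200 with content SUCCESS, anything else fails.
    succeeded = status_code == 200 and content == "SUCCESS"
    return ("SUCCESS" if succeeded else "FAILED", waited)


# Process finishes after three retries (15 seconds of waiting).
print(poll([(200, "RUNNING"), (503, ""), (200, "RUNNING"), (200, "SUCCESS")]))
# Process never leaves RUNNING: all 12 retries are used (60 seconds), step fails.
print(poll([(200, "RUNNING")] * 13))
```

With `total: 12` and a 5-second interval the step waits at most 60 seconds in this model, which matches the arithmetic in the bullet list for this step.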

Step 3: Example of retrieving data from the external process result endpoint after the process has finished

  • GET to the endpoint, with a URL containing the <process_name> variable extracted in Step 1

  • Authorization with bearer token

  • Default retry logic

  • Check that we get a 200 response and parse result_info with a regular expression

  • Any other response makes this step fail

YAML
type: HTTP
request:
  url: https://europe-north1-ade-gcp-ci.cloudfunctions.net/ade-ci-data-processor/<process_name>
  timeout_seconds: 10
  method: GET
  headers:
    Content-Type: application/json
    Authorization: <bearer_token_from_bq_service_account:audience=https://europe-north1-ade-gcp-ci.cloudfunctions.net/ade-ci-data-processor>
retries:
    total: 3
response:
  transformations:
    - description: In case of success extract result_info from the response
      conditions:
        - type: VALUE_MATCHER
          source: <http_status_code>
          values:
            - 200
      variables:
        result_info:
          type: REGEXP_VALUE
          source: <http_response_content>
          regexp: '"generation_time"\s*:\s*"([^"]+)"'