
Single file load


In a single file load, one file is loaded into the target database at a time. Source data files are uploaded to the Agile Data Engine default file load bucket/container, and a notification per file is created in the notify bucket/container.

The files have to be named according to the naming convention described in this article. Parameter values given in the file name guide Agile Data Engine in the execution of the file load.




File naming convention

The naming convention for source files:

table.{table_value}.{timestamp_value}.batch.{batch_value}.fullscanned.{fullscanned_value}.delim.{delim_value}.skiph.{skiph_value}.{format_value}.{compression_value}

Parameter details and guidelines:

| Parameter | Required/Optional | Value | Default | Applicable format | Description |
|---|---|---|---|---|---|
| table | REQUIRED | source entity name | | all | Name of the SOURCE entity. If a SOURCE entity is not defined, name of the STAGE entity. |
| timestamp | REQUIRED | timestamp in milliseconds | | all | File creation time that ensures the file name is unique. |
| batch | OPTIONAL | file batch id | | all | Identifier of the data batch from the process that generated the file. Batch id is essential with the Run ID Logic. |
| fullscanned | OPTIONAL | true/false | false | all | Describes whether the file represents the full data from the source or just a part of it. When set to true, the target table is truncated before the file load. |
| delim | OPTIONAL | comma/tab/semicolon/pipe/hash | comma | csv | Field delimiter of a CSV-formatted data file. |
| skiph | OPTIONAL | number of header lines to skip | 0 | csv | Number of header rows in a CSV-formatted data file which should be skipped in the file load. |
| format | REQUIRED | csv/json/xml/parquet/avro | | all | Data file type. Note that supported formats depend on the target database product. |
| compression | OPTIONAL | gz/bz2/lzo | | all | Data file compression. Note that supported compression methods depend on the target database product. |
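
As an illustration, the following Python sketch composes a file name according to this convention. The helper function and its defaults are hypothetical, not part of Agile Data Engine; only the table, timestamp, and format parts are required, and optional parts are appended only when given.

```python
import time


def build_load_filename(
    table: str,
    file_format: str,
    batch: int | None = None,
    fullscanned: bool = False,
    delim: str = "comma",
    skiph: int = 0,
    compression: str | None = None,
    timestamp_ms: int | None = None,
) -> str:
    """Compose a source file name per the naming convention above.

    Hypothetical helper: table, timestamp and format are always emitted;
    optional parts are appended only when set. fullscanned is omitted
    when false, since false is the default.
    """
    ts = timestamp_ms if timestamp_ms is not None else int(time.time() * 1000)
    parts = [f"table.{table.lower()}", str(ts)]
    if batch is not None:
        parts.append(f"batch.{batch}")
    if fullscanned:
        parts.append("fullscanned.true")
    if file_format == "csv":
        parts += [f"delim.{delim}", f"skiph.{skiph}"]
    parts.append(file_format)
    if compression:
        parts.append(compression)
    return ".".join(parts)


# Reproduces the example file name used later in this article:
print(build_load_filename(
    "ratecode_description", "csv",
    batch=1619000684488, fullscanned=True, delim="comma", skiph=1,
    compression="gz", timestamp_ms=1619000684488,
))
# table.ratecode_description.1619000684488.batch.1619000684488.fullscanned.true.delim.comma.skiph.1.csv.gz
```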


Notes

Use lowercase letters in folder and file names.

OPT_FILE_FORMAT_OPTIONS can be used to override the file format options given in the file name. It can also be used to define target database management system-specific format options that the naming standard might not support.

OPT_DATA_FILE_LOCATION can be used to override the file location in Snowflake and BigQuery. In this case, the data file can be placed in a location other than the default file load bucket/container.


Examples

Single CSV file load into a staging table

SOURCE and STAGE entities created in the Designer and deployed into a Runtime environment:

| NAME | TYPE | SOURCE SYSTEM | PHYSICAL TYPE |
|---|---|---|---|
| RATECODE_DESCRIPTION | SOURCE | taxidata | METADATA_ONLY |
| STG_RATECODE_DESCRIPTION_TAXIDATA | STAGE | | TABLE |

An incoming file load has to be configured for the STAGE entity, with the SOURCE entity mapped as the source. The file load also has to be placed in a schedule.

Parameter values for the filename:

| Parameter | Value |
|---|---|
| table | ratecode_description |
| timestamp | 1619000684488 |
| batch | 1619000684488 |
| fullscanned | true |
| delim | comma |
| skiph | 1 |
| format | csv |
| compression | gz |

Filename:

table.ratecode_description.1619000684488.batch.1619000684488.fullscanned.true.delim.comma.skiph.1.csv.gz

File uploaded to (Azure):

azure://example.blob.core.windows.net/example-file-load-container/ratecode_description/table.ratecode_description.1619000684488.batch.1619000684488.fullscanned.true.delim.comma.skiph.1.csv.gz

File uploaded to (AWS):

s3://example-file-load-bucket/ratecode_description/table.ratecode_description.1619000684488.batch.1619000684488.fullscanned.true.delim.comma.skiph.1.csv.gz

Notification file uploaded to (Azure):

azure://example.blob.core.windows.net/example-notify-container/ratecode_description/table.ratecode_description.1619000684488.batch.1619000684488.fullscanned.true.delim.comma.skiph.1.csv.gz

In AWS the notification file is created automatically:

s3://example-notify-bucket/ratecode_description/table.ratecode_description.1619000684488.batch.1619000684488.fullscanned.true.delim.comma.skiph.1.csv.gz
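
As a sketch of the upload step, assuming the bucket and container names from the paths above and placeholder local paths and credentials: in AWS it is enough to upload the data file (the notification file is created automatically), while in Azure the notification blob has to be created as well. The empty notification body below is an assumption made for illustration; it only marks the data file as ready.

```python
import boto3  # AWS SDK for Python
from azure.storage.blob import BlobServiceClient  # Azure SDK

FILENAME = (
    "table.ratecode_description.1619000684488.batch.1619000684488"
    ".fullscanned.true.delim.comma.skiph.1.csv.gz"
)
KEY = f"ratecode_description/{FILENAME}"  # folder named after the source entity
LOCAL_PATH = f"/tmp/{FILENAME}"           # placeholder local data file

# AWS: upload the data file; the notification file is created automatically.
s3 = boto3.client("s3")
s3.upload_file(Filename=LOCAL_PATH, Bucket="example-file-load-bucket", Key=KEY)

# Azure: upload the data file, then create the notification blob.
service = BlobServiceClient.from_connection_string("<connection-string>")
with open(LOCAL_PATH, "rb") as data:
    service.get_blob_client("example-file-load-container", KEY).upload_blob(
        data, overwrite=True
    )
# Assumption: the notification is an empty blob with the same name as the data file.
service.get_blob_client("example-notify-container", KEY).upload_blob(
    b"", overwrite=True
)
```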

Once deployed and enabled, the workflow is executed according to the schedule, or it can be triggered manually from Workflow Orchestration.
