MASKING POLICY

Target database: Snowflake, Azure SQL Database, Azure Synapse SQL, Databricks SQL

Agile Data Engine supports attribute masking policies in the listed target database management systems. Attribute masking policies can be used to protect sensitive data within a table or a view by hiding actual data values if defined access rights are not provided.

See also:

Usage

PROPERTY

MASKING POLICY: sql_code

Notes

MASKING POLICY is set on attribute level in attribute properties.

Examples

Using a masking policy in Snowflake

Masking policies are SCHEMA level objects that are created and managed in Snowflake with the following privileges:

CREATE MASKING POLICY
APPLY MASKING POLICY
APPLY ON MASKING POLICY

Example masking policy:

SQL

CREATE MASKING POLICY taxidata_dev.publish.email_mask AS (val string) RETURNS string ->
    CASE
        WHEN current_role() IN ('role_ade_dev') THEN val
        ELSE '*********'
    END;

Access to unmasked data needed in data loads
When defining a masking policy, the database user for Agile Data Engine should have a role which is able to access the unmasked data. Otherwise data loads accessing sensitive columns can only see masked data.

This is due to the policy-driven approach, which supports segregation of duties. This allows security administrators to define policies that can limit sensitive data exposure even to the owner of an object (i.e. the role with the OWNERSHIP privilege on the object, such as a table or view), who normally has full access to the underlying data.

Agile Data Engine uses Snowflake masking policies with the Snowflake USER and ROLE that have been assigned to it. Minimum, this role needs to have the APPLY ON MASKING POLICY privilege on the used masking policy so that Agile Data Engine can set/unset the masking policy for the tables and views it is managing:

SQL

GRANT APPLY ON MASKING POLICY taxidata_dev.publish.email_mask TO ROLE role_ade_dev;

Define masking policy on attribute level to attribute created_by in table F_TRIP:

PROPERTY

MASKING POLICY:
email_mask

Resulting DDL generated by Agile Data Engine:

SQL

CREATE TRANSIENT TABLE publish.F_TRIP (
    ...,
    created_by VARCHAR (50) NULL,
    ...
);

ALTER TABLE publish.F_TRIP 
ALTER (COLUMN created_by set masking policy email_mask);

Using a masking policy in Azure SQL Database and Azure Synapse SQL

In Azure SQL Database and Synapse SQL masking rules are defined on a column in a table, four types of masks are available:

Function	Description	Syntax to be used in Designer
Default	Full masking according to the data types of the designated fields.	default()
Email	Masking method that exposes the first letter of an email address and the constant suffix in the form of an email address.	email()
Random	A random masking function for use on any numeric type to mask the original value with a random value within a specified range.	random(1, 12)
Custom string	Masking method that exposes the first and last letters and adds a custom padding string in the middle. prefix,\[padding\],suffix	partial(5,"XXXXXXX",0)

The database USER for Agile Data Engine should have permissions ALTER ANY MASK and UNMASK (or CONTROL, which combines these two) on the database to be able to modify the masking policies and to ensure data loads to function correctly also with the sensitive column data involved.

Define masking policy on attribute level to attribute created_by in table F_TRIP:

PROPERTY

MASKING POLICY:
email()

Resulting DDL generated by Agile Data Engine:

SQL

CREATE TABLE [pub].[F_TRIP] (
  ...,
  [created_by] VARCHAR (50) MASKED WITH (FUNCTION = 'email()') NULL,
  ...
);

Using a masking policy in Databricks SQL

Access to unmasked data needed in data loads
When defining a masking policy, the database user for Agile Data Engine should have a role which is able to access the unmasked data. Otherwise data loads accessing sensitive columns can only see masked data.

This is due to the policy-driven approach, which supports segregation of duties. This allows security administrators to define policies that can limit sensitive data exposure even to the owner of an object (i.e. the role with the OWNERSHIP privilege on the object, such as a table or view), who normally has full access to the underlying data.

Data manipulation logic discontinuity

Databricks SQL does not support cloning a TABLE having data masking defined.

Therefore ADE can not create a rollback clone or a clone nevertheless of the TABLE before executing the schema changes or the loads ie. data manipulation and reverting probable error situation is not possible.

In Databricks SQL masking is implemented by having built-in runtime functions or other user-defined functions defined on the DBMS and then connecting them to the tables requiring them.

Define masking policy on attribute level to attribute created_by in table F_TRIP:

PROPERTY

MASKING POLICY:
pub.masked_email

Resulting DDL generated by Agile Data Engine:

SQL

CREATE TABLE pub.F_TRIP (
  ...,
  created_by VARCHAR (50) MASK pub.masked_email NULL,
  ...
);