MASKING POLICY
Target database: Snowflake, Azure SQL Database, Azure Synapse SQL, Databricks SQL
Agile Data Engine supports attribute masking policies in the listed target database management systems. Attribute masking policies can be used to protect sensitive data within a table or a view by hiding actual data values if defined access rights are not provided.
See also:
Usage
MASKING POLICY: sql_code
Notes
MASKING POLICY is set on attribute level in attribute properties.
Examples
Using a masking policy in Snowflake
Masking policies are SCHEMA level objects that are created and managed in Snowflake with the following privileges:
CREATE MASKING POLICY
APPLY MASKING POLICY
APPLY ON MASKING POLICY
Example masking policy:
CREATE MASKING POLICY taxidata_dev.publish.email_mask AS (val string) RETURNS string ->
CASE
WHEN current_role() IN ('role_ade_dev') THEN val
ELSE '*********'
END;
Access to unmasked data needed in data loads
When defining a masking policy, the database user for Agile Data Engine should have a role which is able to access the unmasked data. Otherwise data loads accessing sensitive columns can only see masked data.
This is due to the policy-driven approach, which supports segregation of duties. This allows security administrators to define policies that can limit sensitive data exposure even to the owner of an object (i.e. the role with the OWNERSHIP privilege on the object, such as a table or view), who normally has full access to the underlying data.
Agile Data Engine uses Snowflake masking policies with the Snowflake USER and ROLE that have been assigned to it. Minimum, this role needs to have the APPLY ON MASKING POLICY privilege on the used masking policy so that Agile Data Engine can set/unset the masking policy for the tables and views it is managing:
GRANT APPLY ON MASKING POLICY taxidata_dev.publish.email_mask TO ROLE role_ade_dev;
Define masking policy on attribute level to attribute created_by in table F_TRIP:
MASKING POLICY:
email_mask
Resulting DDL generated by Agile Data Engine:
CREATE TRANSIENT TABLE publish.F_TRIP (
...,
created_by VARCHAR (50) NULL,
...
);
ALTER TABLE publish.F_TRIP
ALTER (COLUMN created_by set masking policy email_mask);
Using a masking policy in Azure SQL Database and Azure Synapse SQL
In Azure SQL Database and Synapse SQL masking rules are defined on a column in a table, four types of masks are available:
Function | Description | Syntax to be used in Designer |
---|---|---|
Default | Full masking according to the data types of the designated fields. | default() |
Masking method that exposes the first letter of an email address and the constant suffix in the form of an email address. | email() | |
Random | A random masking function for use on any numeric type to mask the original value with a random value within a specified range. | random(1, 12) |
Custom string | Masking method that exposes the first and last letters and adds a custom padding string in the middle. prefix,\[padding\],suffix | partial(5,"XXXXXXX",0) |
The database USER for Agile Data Engine should have permissions ALTER ANY MASK and UNMASK (or CONTROL, which combines these two) on the database to be able to modify the masking policies and to ensure data loads to function correctly also with the sensitive column data involved.
Define masking policy on attribute level to attribute created_by in table F_TRIP:
MASKING POLICY:
email()
Resulting DDL generated by Agile Data Engine:
CREATE TABLE [pub].[F_TRIP] (
...,
[created_by] VARCHAR (50) MASKED WITH (FUNCTION = 'email()') NULL,
...
);
Using a masking policy in Databricks SQL
Access to unmasked data needed in data loads
When defining a masking policy, the database user for Agile Data Engine should have a role which is able to access the unmasked data. Otherwise data loads accessing sensitive columns can only see masked data.
This is due to the policy-driven approach, which supports segregation of duties. This allows security administrators to define policies that can limit sensitive data exposure even to the owner of an object (i.e. the role with the OWNERSHIP privilege on the object, such as a table or view), who normally has full access to the underlying data.
Data manipulation logic discontinuity
Databricks SQL does not support cloning a TABLE having data masking defined.
Therefore ADE can not create a rollback clone or a clone nevertheless of the TABLE before executing the schema changes or the loads ie. data manipulation and reverting probable error situation is not possible.
In Databricks SQL masking is implemented by having built-in runtime functions or other user-defined functions defined on the DBMS and then connecting them to the tables requiring them.
Define masking policy on attribute level to attribute created_by in table F_TRIP:
MASKING POLICY:
pub.masked_email
Resulting DDL generated by Agile Data Engine:
CREATE TABLE pub.F_TRIP (
...,
created_by VARCHAR (50) MASK pub.masked_email NULL,
...
);